Re: [Numpy-discussion] performance matrix multiplication vs. matlab
2009/6/8 Gael Varoquaux gael.varoqu...@normalesup.org:

On Mon, Jun 08, 2009 at 12:29:08AM -0400, David Warde-Farley wrote: On 7-Jun-09, at 6:12 AM, Gael Varoquaux wrote: Well, I do bootstrapping of PCAs, that is SVDs. I can tell you, it makes a big difference, especially since I have 8 cores. Just curious, Gael: how many PCs are you retaining? Have you tried iterative methods (i.e. the EM algorithm for PCA)?

I am using the heuristic exposed in http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996 We have very noisy and long time series. My experience is that most model-based heuristics for choosing the number of PCs retained give us way too many on this problem (they simply keep diverging if I add noise at the end of the time series). The algorithm we use gives us ~50 interesting PCs (each composed of 50 000 dimensions). That happens to be quite right based on our experience with the signal. However, being fairly new to statistics, I am not aware of the EM algorithm that you mention. I'd be interested in a reference, to see if I can use that algorithm. The PCA bootstrap is time-consuming.

Hi,

Given the number of PCs, I think you may just be measuring noise. As said in several manifold reduction publications (such as the ones by Torbjorn Vik, who published on robust PCA for medical imaging), you cannot expect to have more than 4 or 5 meaningful PCs, due to the curse of dimensionality. If you want 50 PCs, you have to have at least... 10^50 samples, which is quite a lot, let's say it this way. According to the literature, a usual manifold can be described by 4 or 5 variables. If you have more, it means you may be violating your hypothesis, here the linearity of your data (and as it is medical imaging, you know from the beginning that this hypothesis is wrong). So if you really want to find something meaningful and/or physical, you should use a real dimensionality reduction, preferably a non-linear one. 
Just my 2 cents ;) Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Functions for indexing into certain parts of an array (2d)
On Sun, Jun 7, 2009 at 6:19 AM, Bruce Southey bsout...@gmail.com wrote: On Sun, Jun 7, 2009 at 3:37 AM, Fernando Perez fperez@gmail.com wrote: One more question. For these *_indices() functions, would you want an interface that accepts *either* diag_indices(size, ndim)?

As I indicated above, this is unacceptable for the apparent usage.

Relax, nobody is trying to sneak past the Committee for the Prevention of Unacceptable Things. It's all now in a patch attached to this ticket: http://projects.scipy.org/numpy/ticket/1132 for regular review. I added the functions, with docstrings and tests. By the way, despite being indicated above as unacceptable, I still see value in being able to create these indexing structures without an actual array, so the implementation contains both versions, but with different names (to avoid the shenanigans that Robert rightfully has a policy of avoiding).

I do not understand what is expected with the ndim argument. If it is the indices of array elements of the form [0][0][0], [1][1][1], ... [k][k][k], where k = min(a.shape) for some array a, then an ndim argument is totally redundant (although using shape is not correct for 1-d arrays). This is different from the diagonals of two 2-d arrays from a shape of 2 by 3 by 4, or some other expectation.

For an n-dimensional array, which probably comes closest to what we think of as a tensor in (multi)linear algebra, the notion of the diagonal as the list of entries with indices A[i,i,...,i] for i in [0...N] is a very natural one.

diag_indices(anarray) +1

These were also added in this form, with the name *_from(A), indicating that size/shape information should be taken from the input A. So both versions exist. Feel free to provide further feedback on the patch.

Cheers, f
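For readers following along, here is a minimal sketch of the two interfaces under discussion. The names mirror the ones in the thread, but the implementation is illustrative only, not the code from the ticket:

```python
import numpy as np

def diag_indices(n, ndim=2):
    # Indices for the "tensor diagonal" A[i, i, ..., i] of an
    # ndim-dimensional array with n entries along each axis.
    idx = np.arange(n)
    return (idx,) * ndim

def diag_indices_from(a):
    # Same, but taking size/shape information from an existing array.
    return diag_indices(a.shape[0], a.ndim)

a = np.arange(9).reshape(3, 3)
a[diag_indices_from(a)] = 0   # zero the main diagonal in place
```

The first form needs no array at all (useful for building the index structure once and reusing it); the second takes its shape information from the input, as described above.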
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On 8-Jun-09, at 1:17 AM, David Cournapeau wrote: I would not be surprised if David had this paper in mind :) http://www.cs.toronto.edu/~roweis/papers/empca.pdf

Right you are :) There is a slight trick to it, though, in that it won't produce an orthogonal basis on its own, just something that spans the principal subspace. So you typically have to at least extract the first PC independently to uniquely orient your basis. You can then either subtract off the projection of the data on the 1st PC and find the next one, one at a time, or extract a spanning set all at once and orthogonalize with respect to the first PC.

David
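To make the trick concrete, here is a rough NumPy transcription of the EM iteration from the Roweis paper (my own sketch, not code from the paper): the E-step projects the data onto the current basis, the M-step updates the basis, and a final QR pass orthogonalizes the spanning set, as discussed above.

```python
import numpy as np

def em_pca(Y, k, n_iter=50, seed=0):
    """EM for PCA a la Roweis (sketch).  Y is (d, n) and assumed
    centered.  Returns a (d, k) orthonormal basis of the leading
    principal subspace -- the raw EM factor only spans it, so we
    orthogonalize at the end."""
    rng = np.random.default_rng(seed)
    d, n = Y.shape
    C = rng.standard_normal((d, k))             # random initial basis
    for _ in range(n_iter):
        X = np.linalg.solve(C.T @ C, C.T @ Y)   # E-step: latent coordinates
        C = Y @ X.T @ np.linalg.inv(X @ X.T)    # M-step: updated basis
    Q, _ = np.linalg.qr(C)                      # orthogonalize the span
    return Q
```

Each iteration costs O(dnk), so for d on the order of 50 000 and a few dozen components this can be much cheaper than a full SVD when only the leading subspace is needed.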
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 08, 2009 at 08:58:29AM +0200, Matthieu Brucher wrote: Given the number of PCs, I think you may just be measuring noise. [...] So if you really want to find something meaningful and/or physical, you should use a real dimensionality reduction, preferably a non-linear one.

I am not sure I am following you: I have time-varying signals. I am not taking a shot of the same process over and over again. My intuition tells me that I have more than 5 meaningful patterns. Anyhow, I do some more analysis after that (ICA actually), and I do find more than 5 patterns of interest that are not noise. So maybe I should be using some non-linear dimensionality reduction, but what I am doing works, and I can write a generative model of it. Most importantly, it is actually quite computationally simple. However, if you can point me to methods that you believe are better (and tell me why you believe so), I am all ears.

Gaël
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
2009/6/8 Gael Varoquaux gael.varoqu...@normalesup.org: On Mon, Jun 08, 2009 at 08:58:29AM +0200, Matthieu Brucher wrote: Given the number of PCs, I think you may just be measuring noise. [...]

I am not sure I am following you: I have time-varying signals. I am not taking a shot of the same process over and over again. My intuition tells me that I have more than 5 meaningful patterns.

How many samples do you have? One? A million? A billion? The problem with 50 PCs is that your search space is mostly empty, thanks to the curse of dimensionality. This means that you *should not* try to attach meaning to the 10th and following PCs.

Anyhow, I do some more analysis after that (ICA actually), and I do find more than 5 patterns of interest that are not noise.

ICA suffers from the same problems as PCA. And I'm not even talking about the linearity hypothesis that is never respected.

So maybe I should be using some non-linear dimensionality reduction, but what I am doing works, and I can write a generative model of it. Most importantly, it is actually quite computationally simple.

Thanks to linearity ;) The problem is that you will have a lot of confounds this way (your 50 PCs can in fact be the effect of 5 nonlinear variables). 
However, if you can point me to methods that you believe are better (and tell me why you believe so), I am all ears.

My thesis was on nonlinear dimensionality reduction (this is why I believe so, especially in the medical imaging field), but it always needs some adaptation. It depends on what you want to do, the time you can spend processing data, ... Suffice it to say we started with PCA some years ago and switched to nonlinear reduction because of the emptiness of the search space and because of the nonlinearity of the brain space (no idea what my former lab is doing now, but it is used for DTI at least). You should check some books on it, and you surely have to read something about the curse of dimensionality (at least if you want to get published, as people know about this issue in the medical field), even if you do not use nonlinear techniques.

Matthieu
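The point about mostly-empty search spaces is easy to demonstrate numerically: even for pure i.i.d. noise, whose true spectrum is perfectly flat, the sample eigenvalues spread out widely when there are far more dimensions than samples, so the leading "components" look deceptively structured. A small illustration (sizes are arbitrary, chosen only to mimic the dimensions >> samples regime):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_dims = 100, 1000                  # far more dimensions than samples
X = rng.standard_normal((n_samples, n_dims))   # pure noise, no structure at all
X -= X.mean(axis=0)

s = np.linalg.svd(X, compute_uv=False)
evals = s**2 / (n_samples - 1)                 # sample covariance eigenvalues

# The true eigenvalues are all 1, but the sample spectrum is far from
# flat: the largest sample eigenvalue sits well above the average one.
ratio = evals[0] / evals[:-1].mean()
```

Any PC-selection heuristic that reads structure off the raw sample spectrum will happily "find" components here, which is exactly the failure mode being warned about.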
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
2009/6/8 David Warde-Farley d...@cs.toronto.edu: On 8-Jun-09, at 1:17 AM, David Cournapeau wrote: I would not be surprised if David had this paper in mind :) http://www.cs.toronto.edu/~roweis/papers/empca.pdf Right you are :) There is a slight trick to it, though, in that it won't produce an orthogonal basis on its own, just something that spans the principal subspace. [...]

Also, Ch. Bishop has an article on using EM for PCA, "Probabilistic Principal Component Analysis", where I think he proves the equivalence as well.

Matthieu
Re: [Numpy-discussion] scipy 0.7.1rc2 released
2009/6/8 Matthieu Brucher matthieu.bruc...@gmail.com:

I'm trying to compile it with ICC 10.1.018, and it fails :|

icc: scipy/special/cephes/const.c
scipy/special/cephes/const.c(94): error: floating-point operation result is out of range
  double INFINITY = 1.0/0.0; /* 99e999; */
scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
  double NAN = 1.0/0.0 - 1.0/0.0;
scipy/special/cephes/const.c(99): error: floating-point operation result is out of range
  double NAN = 1.0/0.0 - 1.0/0.0;
compilation aborted for scipy/special/cephes/const.c (code 2)

At least, it seems to pick up the Fortran compiler correctly (which 0.7.0 didn't seem to do ;))

I manually fixed the files (mconf.h, as well as yn.c, which uses NAN; NAN may not be defined there when NANS is defined, which is the case here for ICC), but I ran into another error (one of the reasons I tried numscons before):

/appli/intel/10.1.018/intel64/fce/bin/ifort -shared -shared -nofor_main build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/fftpack/_fftpackmodule.o build/temp.linux-x86_64-2.5/scipy/fftpack/src/zfft.o build/temp.linux-x86_64-2.5/scipy/fftpack/src/drfft.o build/temp.linux-x86_64-2.5/scipy/fftpack/src/zrfft.o build/temp.linux-x86_64-2.5/scipy/fftpack/src/zfftnd.o build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/fortranobject.o -Lbuild/temp.linux-x86_64-2.5 -ldfftpack -o build/lib.linux-x86_64-2.5/scipy/fftpack/_fftpack.so
ld: build/temp.linux-x86_64-2.5/libdfftpack.a(dffti1.o): relocation R_X86_64_32S against `a local symbol' can not be used when making a shared object; recompile with -fPIC 
build/temp.linux-x86_64-2.5/libdfftpack.a: could not read symbols: Bad value

It seems that the library is not compiled with -fPIC (perhaps because it is a static library?). My compiler options are:

Fortran f77 compiler: ifort -FI -w90 -w95 -xW -axP -O3 -unroll
Fortran f90 compiler: ifort -FR -xW -axP -O3 -unroll
Fortran fix compiler: ifort -FI -xW -axP -O3 -unroll

Matthieu
Re: [Numpy-discussion] scipy 0.7.1rc2 released
I'm trying to compile it with ICC 10.1.018, and it fails :|

[ICC 'floating-point operation result is out of range' errors in scipy/special/cephes/const.c snipped; full output above]

At least, it seems to pick up the Fortran compiler correctly (which 0.7.0 didn't seem to do ;))

Matthieu

2009/6/7 Adam Mercer ramer...@gmail.com: On Fri, Jun 5, 2009 at 06:09, David Cournapeau courn...@gmail.com wrote: Please test it! I am particularly interested in results for scipy binaries on Mac OS X (do they work on PPC?). Test suite passes on Intel Mac OS X (10.5.7) built from source: OK (KNOWNFAIL=6, SKIP=21) nose.result.TextTestResult run=3486 errors=0 failures=0 Cheers, Adam
Re: [Numpy-discussion] scipy 0.7.1rc2 released
Matthieu Brucher wrote: I'm trying to compile it with ICC 10.1.018, and it fails :| [ICC const.c error output snipped] At least, it seems to pick up the Fortran compiler correctly (which 0.7.0 didn't seem to do ;))

This code makes me cry... I know Visual Studio won't like it either. Cephes is a constant source of problems. As I mentioned a couple of months ago, I think the only solution is to rewrite most of scipy.special, at least the parts using cephes, using for example Boost algorithms and unit tests. But I have not started anything concrete - Pauli did most of the work on scipy.special recently (kudos to Pauli for consistently improving scipy.special, BTW).

cheers,

David
Re: [Numpy-discussion] scipy 0.7.1rc2 released
2009/6/8 David Cournapeau da...@ar.media.kyoto-u.ac.jp: [quoted ICC error output snipped] Cephes is a constant source of problems. As I mentioned a couple of months ago, I think the only solution is to rewrite most of scipy.special, at least the parts using cephes, using for example Boost algorithms and unit tests. [...]

It could be simply enhanced by refactoring only mconf.h with proper compiler flags, and fixing yn.c to remove the NAN detection (as it should be in mconf.h). Unfortunately, I have no time for this at the moment (besides the fact that it is on my workstation, not at home).

Matthieu
Re: [Numpy-discussion] extract elements of an array that are contained in another array?
Hi Josef, thanks for the summary! I am responding below; later I will make an enhancement ticket.

josef.p...@gmail.com wrote: On Sat, Jun 6, 2009 at 4:42 AM, Neil Crighton neilcrigh...@gmail.com wrote: Robert Cimrman cimrman3 at ntc.zcu.cz writes: Anne Archibald wrote: 1. add a keyword argument to intersect1d, assume_unique; if it is not present, check for uniqueness and emit a warning if not unique 2. change the warning to an exception Optionally: 3. change the meaning of the function to that of intersect1d_nu if the keyword argument is not present

1. merge the _nu version into one function --- You mean something like:

    def intersect1d(ar1, ar2, assume_unique=False):
        if not assume_unique:
            return intersect1d_nu(ar1, ar2)
        else:
            ...  # the current code

intersect1d_nu could still be exported to the numpy namespace, or not.

+1 - from the user's point of view there should just be intersect1d and setmember1d (i.e. no '_nu' versions). The assume_unique keyword Robert suggests can be used if speed is a problem.

+1 on rolling the _nu versions this way into the plain version; this would avoid a lot of the confusion. It would not be a code-breaking API change for existing correct usage (but some speed regression without the keyword). +1

deprecate intersect1d_nu ^^ intersect1d_nu could still be exported to the numpy namespace, or not. I would say not, if they are the default branch of the non-_nu version. +1 on deprecation. +0

2. alias --- as in: I really like in1d (no underscore) as a new name for setmember1d_nu. inarray is another possibility. I don't like 'ain'; the 'a' in front of 'in' detracts from readability, unlike the extra a in arange. I don't like the extra 'a's either; namespaces are commonly used. Alias setmember1d_nu as `in1d` or `isin1d`, because the function is an 'in' and not a set operation. +1 +1

3. behavior of other set functions --- guarantee that setdiff1d works for non-unique arrays (even when the implementation changes), and change the documentation. +1 +1, it is useful for non-unique arrays.

need to check other functions ^^ union1d: works for non-unique arrays (obvious from the source). Yes.

setxor1d: requires unique arrays: np.setxor1d([1,2,3,3,4,5], [0,0,1,2,2,6]) gives array([2, 4, 5, 6]), while np.setxor1d(np.unique([1,2,3,3,4,5]), np.unique([0,0,1,2,2,6])) gives array([0, 3, 4, 5, 6]). setxor: add a keyword option and call unique by default. +1 for symmetry. +1 - you mean np.setxor1d(np.unique(a), np.unique(b)) to become np.setxor1d(a, b, assume_unique=False), right?

ediff1d and unique1d are defined for non-unique arrays. Yes.

4. name of keyword --- intersect1d(ar1, ar2, assume_unique=False); alternatives: isunique=False, or just unique=False. +1, less to write. We should look at other functions in numpy (and/or scipy) for a common scheme here. -1e-1 to the proposed names, as isunique is singular only, and unique=False does not show the intent clearly to me. What about ar1_unique=False, ar2_unique=False - to address each argument specifically?

5. module name --- rename arraysetops to something easier to read, like setfun. I think it would only involve internal changes, since all functions are exported to the main numpy namespace. +1e-4 (I got used to arrayse_tops). +0 (internal change only). Other numpy/scipy submodules containing a bunch of functions are called *pack (fftpack, arpack, lapack), *alg (linalg), *utils. *fun is commonly used in the matlab world.

6. keep docs in sync with correct usage --- obvious. +1

thanks, r.
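As a concrete sketch of option 1 (illustrative only, not the actual patch), the merged function could look like this, with the classic sort-based intersection inlined:

```python
import numpy as np

def intersect1d_merged(ar1, ar2, assume_unique=False):
    # Proposed interface: handle non-unique input by default; callers
    # who know their arrays are already unique can skip the np.unique
    # calls for speed.
    if not assume_unique:
        ar1 = np.unique(ar1)
        ar2 = np.unique(ar2)
    aux = np.concatenate((ar1, ar2))
    aux.sort()
    # A value common to both (now unique) arrays appears twice in a row.
    return aux[:-1][aux[1:] == aux[:-1]]
```

With this default, `intersect1d_merged([1, 2, 2, 3], [2, 3, 3, 4])` behaves like the current intersect1d_nu, while `assume_unique=True` keeps the old fast path.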
Re: [Numpy-discussion] scipy 0.7.1rc2 released
On Mon, Jun 8, 2009 at 8:45 PM, Matthieu Brucher matthieu.bruc...@gmail.com wrote: [quoted ICC error output and discussion snipped] It could be simply enhanced by refactoring only mconf.h with proper compiler flags, and fix yn.c to remove the NAN detection (as it should be in the mconf.h). 
NAN and co definitions should be dealt with using the portable definitions we now have in numpy - I just have to find a way to reuse the corresponding code outside numpy (distutils currently does not handle proper installation of libraries built through build_clib); it is on my TODO list for scipy 0.8. Unfortunately, this is only the tip of the iceberg. A lot of code in cephes uses #ifdef on platform specificities, and let's not forget it is pre-ANSI C code (K&R declarations), with a lot of hidden bugs.

cheers,

David
Re: [Numpy-discussion] scipy 0.7.1rc2 released
Good luck with fixing this then :| I've tried to build scipy with the MKL and with ATLAS, and in both cases I get a segmentation fault. With the MKL, it is the same as in a previous mail, and for ATLAS it is here:

Regression test for #946. ... Segmentation fault

A bad ATLAS compilation?

Matthieu
Re: [Numpy-discussion] scipy 0.7.1rc2 released
2009/6/8 David Cournapeau da...@ar.media.kyoto-u.ac.jp: Matthieu Brucher wrote: Good luck with fixing this then :| I've tried to build scipy with the MKL and ATLAS, and I have in both cases a segmentation fault. [...] Regression test for #946. ... Segmentation fault

Could you try the last revision in the 0.7.x branch? There were quite a few problems with this exact code (that's the only reason why scipy 0.7.1 is not released yet, actually), and I added an ugly workaround for the time being, but that should work.

David

Is there a way to get a tarball from the repository on the webpage? I can't do a checkout (no TortoiseSVN installed on my Windows box and no web access from Linux :()

Matthieu
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
Note that EM can be very slow to converge: http://www.cs.toronto.edu/~roweis/papers/emecgicml03.pdf

EM is great for churning out papers, not so great for getting real work done. Conjugate gradient is a much better tool, at least in my (and Salakhutdinov's) experience. Btw, have you considered how much the Gaussianity assumption is hurting you?

Jason

On Mon, Jun 8, 2009 at 1:17 AM, David Cournapeau da...@ar.media.kyoto-u.ac.jp wrote: Gael Varoquaux wrote: I am using the heuristic exposed in http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4562996 [...] I'd be interested in a reference, to see if I can use that algorithm. I would not be surprised if David had this paper in mind :) http://www.cs.toronto.edu/~roweis/papers/empca.pdf cheers, David

-- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 08, 2009 at 08:33:11AM -0400, Jason Rennie wrote: EM is great for churning out papers, not so great for getting real work done.

That's just what I thought.

Btw, have you considered how much the Gaussianity assumption is hurting you?

I have. And the answer is: not much. But then, my order-selection method is precisely about selecting the non-gaussian components. And the non-orthogonality of the interesting 'independent' signals is small, in that subspace.

Gaël
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 8, 2009 at 3:29 AM, Gael Varoquaux gael.varoqu...@normalesup.org wrote: [quoted exchange on the number of meaningful PCs snipped] I am not sure I am following you: I have time-varying signals. I am not taking a shot of the same process over and over again. My intuition tells me that I have more than 5 meaningful patterns. Anyhow, I do some more analysis after that (ICA actually), and I do find more than 5 patterns of interest that are not noise.

Just curious: what's the actual shape of the array/data you run your PCA on? Number of time periods, size of cross-section at a point in time?

Josef
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
Jason Rennie wrote: Note that EM can be very slow to converge: http://www.cs.toronto.edu/~roweis/papers/emecgicml03.pdf EM is great for churning out papers, not so great for getting real work done. I think it depends on what you are doing - EM is used for 'real' work too, after all :) Conjugate gradient is a much better tool, at least in my (and Salakhutdinov's) experience. Thanks for the link, I was not aware of this work. What is the difference between the ECG method and the method proposed by Lange in [1]? To avoid 'local trapping' of the parameter in EM methods, recursive EM [2] may also be a promising method, though it seems to me that it has not been used much. I may well be wrong (I have seen several people using a simplified version of it without much theoretical consideration in speech processing). cheers, David [1] A gradient algorithm locally equivalent to the EM algorithm, Journal of the Royal Statistical Society, Series B (Methodological), 1995, vol. 57, no. 2, pp. 425-437 [2] Online EM Algorithm for Latent Data Models, by Olivier Cappé and Eric Moulines, Journal of the Royal Statistical Society, Series B (February 2009). ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.p...@gmail.com wrote: whats the actual shape of the array/data you run your PCA on. 50 000 dimensions, 820 datapoints. Number of time periods, size of cross section at point in time? I am not sure what the question means. The data is sampled at 0.5Hz. Gaël ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
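[Editor's note] For data shaped like this (many more dimensions than samples), PCA is usually computed from the thin SVD of the centered data matrix rather than from a 50 000 × 50 000 covariance. A minimal NumPy sketch, with toy sizes so it runs quickly; the shape convention (samples × dimensions) is the only thing carried over from the thread:

```python
import numpy as np

rng = np.random.RandomState(0)
# Toy stand-in for the dataset in the thread (820 samples x 50 000 dims);
# much smaller here so it runs quickly.
n_samples, n_features = 80, 500
X = rng.randn(n_samples, n_features)

Xc = X - X.mean(axis=0)                        # center each dimension
# Thin SVD: U is (n_samples, k), Vt is (k, n_features), k = min(n, p)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_variance = s ** 2 / (n_samples - 1)  # covariance eigenvalues
scores = U * s                                 # samples projected on the PCs
```

The thin SVD costs O(n² p) here instead of the O(p³) eigendecomposition of the covariance, which is what makes 50 000-dimensional data tractable at all.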
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
2009/6/8 Gael Varoquaux gael.varoqu...@normalesup.org: On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.p...@gmail.com wrote: what's the actual shape of the array/data you run your PCA on. 50 000 dimensions, 820 datapoints. You definitely can't expect to find 50 meaningful PCs. It's impossible to robustly get them with fewer than a thousand points! Number of time periods, size of cross section at point in time? I am not sure what the question means. The data is sampled at 0.5Hz. Gaël -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 8, 2009 at 6:17 AM, Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.p...@gmail.com wrote: whats the actual shape of the array/data you run your PCA on. 50 000 dimensions, 820 datapoints. Have you tried shuffling each time series, performing PCA, looking at the magnitude of the largest eigenvalue, then repeating many times? That will give you an idea of how large the noise can be. Then you can see how many eigenvectors of the unshuffled data have eigenvalues greater than the noise. It would be kind of the empirical approach to random matrix theory. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
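[Editor's note] The shuffling test Keith describes can be sketched in a few lines of NumPy. This is an illustrative toy, not the procedure from the paper Gael cites: the sizes, the number of shuffles, and the planted signal are all made up for the example:

```python
import numpy as np

rng = np.random.RandomState(42)
n_samples, n_features = 100, 300
X = rng.randn(n_samples, n_features)
# Plant one shared pattern across the first 50 series
signal = np.sin(np.linspace(0, 10, n_samples))
X[:, :50] += signal[:, None]

def top_eigenvalue(data):
    """Largest eigenvalue of the sample covariance, via SVD."""
    centered = data - data.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s[0] ** 2 / (len(data) - 1)

# Null distribution: shuffle each time series independently, which
# destroys cross-series correlation but keeps each series' marginal.
null = []
for _ in range(20):
    shuffled = np.empty_like(X)
    for j in range(n_features):
        shuffled[:, j] = rng.permutation(X[:, j])
    null.append(top_eigenvalue(shuffled))
threshold = max(null)

centered = X - X.mean(axis=0)
eigvals = np.linalg.svd(centered, compute_uv=False) ** 2 / (n_samples - 1)
n_significant = int(np.sum(eigvals > threshold))
```

Eigenvalues above the shuffled-data threshold are then taken as carrying real cross-series structure, which is the empirical random-matrix idea in the message above.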
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 08, 2009 at 06:28:06AM -0700, Keith Goodman wrote: On Mon, Jun 8, 2009 at 6:17 AM, Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Mon, Jun 08, 2009 at 09:02:12AM -0400, josef.p...@gmail.com wrote: what's the actual shape of the array/data you run your PCA on. 50 000 dimensions, 820 datapoints. Have you tried shuffling each time series, performing PCA, looking at the magnitude of the largest eigenvalue, then repeating many times? That will give you an idea of how large the noise can be. Then you can see how many eigenvectors of the unshuffled data have eigenvalues greater than the noise. It would be kind of the empirical approach to random matrix theory. Yes, that's the kind of thing that is done in the paper I pointed out, and that I use to infer the number of PCs I retain. Gaël ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] extract elements of an array that are contained in another array?
Robert Cimrman wrote: Hi Josef, thanks for the summary! I am responding below, later I will make an enhancement ticket. Done, see http://projects.scipy.org/numpy/ticket/1133 r. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] setmember1d_nu
Robert Cimrman wrote: Hi Neil, Neil Crighton wrote: Hi all, I posted this message a couple of days ago, but gmane grouped it with an old thread and it hasn't shown up on the front page. So here it is again... I'd really like to see the setmember1d_nu function in ticket 1036 get into numpy. There's a patch waiting for review that includes tests: http://projects.scipy.org/numpy/ticket/1036 Is there anything I can do to help get it applied? I guess I could commit it, if you review the patch and it works for you. Obviously, I cannot review it myself, but my SVN access may still work :) Thanks for the review, it is in! r. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] scipy 0.7.1rc2 released
OK, I'm stuck with #946 with the MKL as well (finally managed to compile and use it with only the static library, save for libguide). I'm trying to download the trunk at the moment to check if the segmentation fault is still there. Matthieu 2009/6/8 Matthieu Brucher matthieu.bruc...@gmail.com: Good luck with fixing this then :| I've tried to build scipy with the MKL and ATLAS, and I have in both cases a segmentation fault. With the MKL, it is the same as in a previous mail, and for ATLAS it is there: Regression test for #946. ... Segmentation fault A bad ATLAS compilation? Matthieu It could be simply enhanced by refactoring only mconf.h with proper compiler flags, and fixing yn.c to remove the NAN detection (as it should be in mconf.h). NAN and co. definitions should be dealt with using the portable definitions we now have in numpy - I just have to find a way to reuse the corresponding code outside numpy (distutils currently does not handle proper installation of libraries built through build_clib); it is on my TODO list for scipy 0.8. Unfortunately, this is only the tip of the iceberg. A lot of code in cephes uses #ifdef on platform specificities, and let's not forget it is pre-ANSI C code (K&R declarations), with a lot of hidden bugs. cheers, David -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix default to column vector?
Is this lack of associativity really *always* such a huge issue? I can imagine many situations where it is not. One just wants to compute A*B*C, without knowing whether A*(B*C) or (A*B)*C is best. If the user is allowed to blindly use A*B*C, I don't really see why he wouldn't be allowed to use dot(A,B,C) with the same convention... One should realize that allowing dot(A,B,C) is just *better* than the present situation where the user is forced into writing dot(dot(A,B),C) or dot(A,dot(B,C)). One does not remove any liberty from the user. He may always switch back to one of the above forms if he really knows which is best for him. So I fail to see exactly where the problem is... == Olivier 2009/6/7 Robert Kern robert.k...@gmail.com On Sun, Jun 7, 2009 at 04:44, Olivier Verdier zelb...@gmail.com wrote: Yes, I found the thread you are referring to: http://mail.python.org/pipermail/python-dev/2008-July/081554.html However, since A*B*C exists for matrices and actually computes (A*B)*C, why not do the same with dot? I.e. why not decide that dot(A,B,C) does what A*B*C would do, i.e., dot(dot(A,B),C)? The performance and precision problems are the responsibility of the user, just as with the formula A*B*C. I'm happy to make the user responsible for performance and precision problems if he has the tools to handle them. The operator gives the user the easy ability to decide the precedence with parentheses. The function does not. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
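[Editor's note] A chained-dot helper with the convention proposed above is a one-liner over functools.reduce. `mdot` is a hypothetical name, not an existing NumPy function at the time of this thread (later NumPy versions added `np.linalg.multi_dot`, which additionally picks the cheapest parenthesization):

```python
import functools
import numpy as np

def mdot(*arrays):
    # Hypothetical helper: left-to-right chained product, so
    # mdot(A, B, C) == dot(dot(A, B), C), mirroring A*B*C for matrices.
    return functools.reduce(np.dot, arrays)

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)
C = np.arange(8).reshape(4, 2)
result = mdot(A, B, C)
```

Note that this fixes left-to-right evaluation; a user who knows a better grouping can still parenthesize explicitly with nested `np.dot` calls, which is exactly Robert Kern's point.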
Re: [Numpy-discussion] scipy 0.7.1rc2 released
David, I've checked out the trunk, and the segmentation fault isn't there anymore (the trunk is labeled 0.8.0 though) Here is the log from the remaining errors with the MKL: == ERROR: Failure: ImportError (/data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/linalg/atlas_version.so: undefined symbol: ATL_buildinfo) -- Traceback (most recent call last): File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/loader.py, line 364, in loadTestsFromName addr.filename, addr.module) File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/importer.py, line 39, in importFromPath return self.importFromDir(dir_path, fqname) File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/importer.py, line 84, in importFromDir mod = load_module(part_fqname, fh, filename, desc) File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/linalg/tests/test_atlas_version.py, line 8, in module import scipy.linalg.atlas_version ImportError: /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/linalg/atlas_version.so: undefined symbol: ATL_buildinfo == ERROR: test_io.test_imread -- Traceback (most recent call last): File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/nose/case.py, line 182, in runTest self.test(*self.arg) File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/ndimage/tests/test_io.py, line 8, in test_imread img = ndi.imread(lp) AttributeError: 'module' object has no attribute 'imread' == ERROR: test_outer_v (test_lgmres.TestLGMRES) -- Traceback (most recent call last): File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/sparse/linalg/isolve/tests/test_lgmres.py, line 52, in test_outer_v x0, count_0 = do_solve(outer_k=6, outer_v=outer_v) File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/sparse/linalg/isolve/tests/test_lgmres.py, line 29, in do_solve x0, flag = lgmres(A, b, x0=zeros(A.shape[0]), inner_m=6, tol=1e-14, **kw) TypeError: 'module' 
object is not callable == ERROR: test_preconditioner (test_lgmres.TestLGMRES) -- Traceback (most recent call last): File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/sparse/linalg/isolve/tests/test_lgmres.py, line 41, in test_preconditioner x0, count_0 = do_solve() File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/sparse/linalg/isolve/tests/test_lgmres.py, line 29, in do_solve x0, flag = lgmres(A, b, x0=zeros(A.shape[0]), inner_m=6, tol=1e-14, **kw) TypeError: 'module' object is not callable == ERROR: test_iv_cephes_vs_amos (test_basic.TestBessel) -- Traceback (most recent call last): File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py, line 1691, in test_iv_cephes_vs_amos self.check_cephes_vs_amos(iv, iv, rtol=1e-12, atol=1e-305) File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py, line 1672, in check_cephes_vs_amos assert_tol_equal(c1, c2, err_msg=(v, z), rtol=rtol, atol=atol) File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/special/tests/test_basic.py, line 38, in assert_tol_equal verbose=verbose, header=header) File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py, line 377, in assert_array_compare val = comparison(x[~xnanid], y[~ynanid]) IndexError: 0-d arrays can't be indexed == FAIL: test_lorentz (test_odr.TestODR) -- Traceback (most recent call last): File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/scipy/odr/tests/test_odr.py, line 292, in test_lorentz 3.7798193600109009e+00]), File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py, line 537, in assert_array_almost_equal header='Arrays are not almost equal') File /data/pau112/INNO/local/x86_64/lib/python2.5/site-packages/numpy/testing/utils.py, line 395, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal (mismatch 100.0%) x: array([ 
1.e+03, 1.e-01, 3.8000e+00])
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On Mon, Jun 8, 2009 at 8:55 AM, David Cournapeau da...@ar.media.kyoto-u.ac.jp wrote: I think it depends on what you are doing - EM is used for 'real' work too, after all :) Certainly, but EM is really just a mediocre gradient descent/hill climbing algorithm that is relatively easy to implement. Thanks for the link, I was not aware of this work. What is the difference between the ECG method and the method proposed by Lange in [1] ? To avoid 'local trapping' of the parameter in EM methods, recursive EM [2] may also be a promising method, also it seems to me that it has not been used so much, but I may well be wrong (I have seen several people using a simplified version of it without much theoretical consideration in speech processing). I hung out in the machine learning community appx. 1999-2007 and thought the Salakhutdinov work was extremely refreshing to see after listening to no end of papers applying EM to whatever was the hot topic at the time. :) I've certainly seen/heard about various fixes to EM, but I haven't seen convincing reason(s) to prefer it over proper gradient descent/hill climbing algorithms (besides its presentability and ease of implementation). Cheers, Jason -- Jason Rennie Research Scientist, ITA Software 617-714-2645 http://www.itasoftware.com/ ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
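[Editor's note] For reference, the EM algorithm for PCA discussed in this thread (in the style of Roweis's EM-PCA) alternates a least-squares projection step and a basis update, never forming the full p × p covariance. A hedged sketch on synthetic low-rank data; all sizes and the iteration budget are illustrative:

```python
import numpy as np

rng = np.random.RandomState(0)
n_features, n_samples, k = 200, 50, 3

# Synthetic data with a clear rank-3 principal subspace plus small noise
W = rng.randn(n_features, k)
H = rng.randn(k, n_samples)
X = np.dot(W, H) + 0.1 * rng.randn(n_features, n_samples)
X -= X.mean(axis=1, keepdims=True)

C = rng.randn(n_features, k)  # random initial basis (p x k)
for _ in range(50):
    # E-step: least-squares coordinates of the data in the current basis
    Z = np.linalg.solve(np.dot(C.T, C), np.dot(C.T, X))
    # M-step: update the basis given those coordinates
    C = np.dot(np.dot(X, Z.T), np.linalg.inv(np.dot(Z, Z.T)))

Q, _ = np.linalg.qr(C)  # orthonormal basis spanning ~ the top-k subspace
```

Each iteration costs O(npk) rather than the O(p³) of an eigendecomposition, which is the appeal; the slow convergence Jason mentions shows up when the k-th and (k+1)-th eigenvalues are close.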
Re: [Numpy-discussion] scipy 0.7.1rc2 released
Matthieu Brucher wrote: David, I've checked out the trunk, and the segmentation fault isn't there anymore (the trunk is labeled 0.8.0 though) Yes, the upcoming 0.7.1 release has its code in the 0.7.x svn branch. But the fix for #946 is a backport of 0.8.0, so in theory, it should be fixed :) Concerning the other errors: did you compile with intel compilers or GNU ones ? cheers, David ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] scipy 0.7.1rc2 released
2009/6/8 David Cournapeau da...@ar.media.kyoto-u.ac.jp: Matthieu Brucher wrote: David, I've checked out the trunk, and the segmentation fault isn't there anymore (the trunk is labeled 0.8.0 though) Yes, the upcoming 0.7.1 release has its code in the 0.7.x svn branch. But the fix for #946 is a backport of 0.8.0, so in theory, it should be fixed :) OK, I didn't check the branches, I should have :| Concerning the other errors: did you compile with intel compilers or GNU ones ? Only Intel compilers. Maybe I should check the rc branch instead of the trunk? Matthieu -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
Jason Rennie wrote: I hung out in the machine learning community appx. 1999-2007 and thought the Salakhutdinov work was extremely refreshing to see after listening to no end of papers applying EM to whatever was the hot topic at the time. :) Isn't it true for any general framework that enjoys some popularity :) I've certainly seen/heard about various fixes to EM, but I haven't seen convincing reason(s) to prefer it over proper gradient descent/hill climbing algorithms (besides its presentability and ease of implementation). I think there are cases where gradient methods are not applicable (latent models where the complete data Y cannot be split into observed-hidden (O, H) variables), although I am not sure that's a very common case in machine learning, cheers, David ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] scipy 0.7.1rc2 released
Matthieu Brucher wrote: Concerning the other errors: did you compile with intel compilers or GNU ones ? Only Intel compilers. Maybe I should check the rc branch instead of the trunk? I just wanted to confirm - I am actually rather surprised there are not more errors :) cheers, David ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
Hi All, I'm new to python and tools like matplotlib and Mayavi so I may be missing something basic. I've been looking for a fairly lightweight editor/interactive shell combo that allows me to create plots and figures from a shell and play with them and kill them gracefully. The Mayavi documentation describes using IPython with the -wthread option at startup and while this works well I really would like to use an environment where I can see the variables. I really like the PyDee layout and it has everything I need (editor view, shell, workspace, doc view) but I don't think it can be used with IPython (yet). Does anyone have any suggestions? In Pydee I can generate plots and update them from the shell but I can't (or don't know how to) kill them. Is there an alternative with an IPython shell that has a built-in editor view or at least a workspace where I can view variables? I hope this is an acceptable place to post this. If not please let me know if you know a better place to ask. Cheers, Jonno. -- If a theory can't produce hypotheses, can't be tested, can't be disproven, and can't make predictions, then it's not a theory and certainly not science. by spisska on Slashdot, Monday April 21, 2008 ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 8, 2009 at 12:29 PM, Jonno jonnojohn...@gmail.com wrote: Hi All, I'm new to python and tools like matplotlib and Mayavi so I may be missing something basic. I've been looking for a fairly lightweight editor/interactive shell combo that allows me to create plots and figures from a shell and play with them and kill them gracefully. The Mayavi documentation describes using IPython with the -wthread option at startup and while this works well I really would like to use an environment where I can see the variables. I really like the PyDee layout and it has everything I need (editor view, shell, workspace, doc view) but I don't think it can be used with IPython (yet). Does anyone have any suggestions? In Pydee I can generate plots and update them from the shell but I can't (or don't know how to) kill them. I'm using now pydee as my main shell to try out new scripts and I don't have any problems with the plots. I'm creating plots the standard way from matplotlib import pyplot as plt plt.plot(x,y) and I can close the popping-up plot windows. If I have too many plot windows, I use plt.close('all') and it works without problems. Sometimes the windows are a bit slow in responding, and I need to use plt.show() more often than in a regular script. I especially like the source view doc window in pydee, and the select lines and execute, and ... Josef Is there an alternative with an IPython shell that has a built-in editor view or at least a workspace where I can view variables? I hope this is an acceptable place to post this. If not please let me know if you know a better place to ask. Cheers, Jonno. -- If a theory can't produce hypotheses, can't be tested, can't be disproven, and can't make predictions, then it's not a theory and certainly not science. 
by spisska on Slashdot, Monday April 21, 2008 ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix default to column vector?
Olivier Verdier wrote: One should realize that allowing dot(A,B,C) is just *better* than the present situation where the user is forced into writing dot(dot(A,B),C) or dot(A,dot(B,C)). I'm lost now -- how is this better in any significant way? Tom K. wrote: But, almost all experienced users drift away from matrix toward array as they find the matrix class too limiting or strange That's one reason, and the other is that when you are doing real work, it is very rare for the linear algebra portion to be significant. I know in my code (and this was true when I was using MATLAB too), I may have 100 lines of code, and one of them is a linear algebra expression that could be expressed nicely with matrices and infix operators. Given that the rest of the code is more natural with nd-arrays, why the heck would I want to use matrices? this drove me crazy with MATLAB -- I hated the default matrix operators, I was always typing .*, etc. - it seems only applicable for new users and pedagogical purposes. and I'd take the new users of this list -- it serves no one to teach people something first, then tell them to abandon it. Which leaves the pedagogical purposes. In that case, you really need operators, slightly cleaner syntax that isn't infix really doesn't solve the pedagogical function. It seems there is a small but significant group of folks on this list that want matrices for that reason. That group needs to settle on a solution, and then implement it. Personally, I think the row-vector, column-vector approach is the way to go -- even though, yes, these are matrices that happen to have one dimension or the other set to one, I know that when I was learning LA (and still), I thought about row and column vectors a fair bit, and indexing them in a simple way would be nice. But I don't teach, so I'll stop there. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 8, 2009 at 11:39 AM, josef.p...@gmail.com wrote: I'm using now pydee as my main shell to try out new scripts and I don't have any problems with the plots. I'm creating plots the standard way from matplotlib import pyplot as plt plt.plot(x,y) and I can close the poping up plot windows. if I have too many plot windows, I use plt.close(all) and it works without problems. Sometimes the windows are a bit slow in responding, and I need to use plt.show() more often than in a regular script. I especially like the source view doc window in pydee, and the select lines and execute, and ... Josef Thanks Josef, I shouldn't have included Matplotlib since Pydee does work well with its plots. I had forgotten that. It really is just the Mayavi plots (or scenes I guess) that don't play well. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 8, 2009 at 11:35 AM, Gökhan SEVER gokhanse...@gmail.com wrote: Hello, To me, IPython is the right way to follow. Try whos to see what's in your namespace. You may want see this instructional video (A Demonstration of the 'IPython' Interactive Shell) to learn more about IPython's functionality or you can delve in its documentation. There are IPython integrations plans for pydee. You can see the details on pydee's google page. Gökhan Thanks Gokhan, I didn't know about whos so thanks for the tip. What about a lightweight editor with an integrated IPython shell then? I also found PyScripter which looks pretty nice too but also has the same lack of IPython shell. ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix default to column vector?
2009/6/8 Christopher Barker chris.bar...@noaa.gov Olivier Verdier wrote: One should realize that allowing dot(A,B,C) is just *better* than the present situation where the user is forced into writing dot(dot(A,B),C) or dot(A,dot(B,C)). I'm lost now -- how is this better in any significant way? Well, allowing dot(A,B,C) does not remove any other possibility does it? That is what I meant by better. It just gives the user an extra possibility. What would be wrong with that? Especially since matrix users already can write A*B*C. I won't fight for this though. I personally don't care but I think that it would remove the last argument for matrices against arrays, namely the fact that A*B*C is easier to write than dot(dot(A,B),C). I don't understand why it would be a bad idea to implement this dot(A,B,C). Tom K. wrote: But, almost all experienced users drift away from matrix toward array as they find the matrix class too limiting or strange That's one reason, and the other is that when you are doing real work, it is very rare for the linear algebra portion to be significant. I know in my code (and this was true when I was using MATLAB too), I may have 100 lines of code, and one of them is a linear algebra expression that could be expressed nicely with matrices and infix operators. Given that the rest of the code is more natural with nd-arrays, why the heck would I want to use matrices? this drove me crazy with MATLAB -- I hated the default matrix operators, I was always typing .*, etc. This exactly agrees with my experience too. == Olivier ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 8, 2009 at 12:11 PM, Jonno jonnojohn...@gmail.com wrote: On Mon, Jun 8, 2009 at 11:35 AM, Gökhan SEVER gokhanse...@gmail.com wrote: Hello, To me, IPython is the right way to follow. Try whos to see what's in your namespace. You may want see this instructional video (A Demonstration of the 'IPython' Interactive Shell) to learn more about IPython's functionality or you can delve in its documentation. There are IPython integrations plans for pydee. You can see the details on pydee's google page. Gökhan Thanks Gokhan, I didn't know about whos so thanks for the tip. What about a lightweight editor with an integrated IPython shell then? I also found PyScripter which looks pretty nice too but also has the same lack of IPython shell. I use SciTE as my main text editor. It highlights Python syntax nicely, and has code-completion support. Well not as powerful as Eclipse-PyDev pair but it works :) And yes PyDev doesn't have IPython integration either. Eclipse-PyDev is also slow to me, (loading takes lots of time :)) and shell integration not as easy as in IPy. I am looking forward to pydee developer's to bring IPython functionality into their development environment. Besides PyScripter, there is also Eric4 as a free IDE for Python, but again no IPython. So far, IPython-SciTE is the fastest that I can build my programs. Experiment in IPython and build pieces in SciTE. I would like to know what others use in this respect? ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Jun 8, 2009, at 12:26 PM, Gökhan SEVER wrote: On Mon, Jun 8, 2009 at 12:11 PM, Jonno jonnojohn...@gmail.com wrote: On Mon, Jun 8, 2009 at 11:35 AM, Gökhan SEVER gokhanse...@gmail.com wrote: Hello, To me, IPython is the right way to follow. Try whos to see what's in your namespace. You may want see this instructional video (A Demonstration of the 'IPython' Interactive Shell) to learn more about IPython's functionality or you can delve in its documentation. There are IPython integrations plans for pydee. You can see the details on pydee's google page. Gökhan Thanks Gokhan, I didn't know about whos so thanks for the tip. What about a lightweight editor with an integrated IPython shell then? I also found PyScripter which looks pretty nice too but also has the same lack of IPython shell. I use scite as my main text editor. It highlights Python syntax nicely, and has code-completion support. Well not as powerful as Eclipse-PyDev pair but it works :) And yes PyDev doesn't have IPython integration either. Eclipse-PyDev is also slow to me, (loading takes lots of time :)) and shell integration not as easy as in IPy. I am looking forward to pydee developer's to bring IPython functionality into their development environment. Besides PyScripter, there is also Eric4 as a free IDE for Python, but again no IPython. So far, IPython-Scite is the fastest that I can build my programs. Experiment in IPython and build pieces in Scite. I would like to know what others use in this respect? You might take a look at EPDLab as well. Thanks to Gael Varoquaux, it integrates IPython into an Envisage application and has a crude name-space browser. EPDLab is part of the Enthought Tool Suite and is an open-source application (BSD-style license). It's another example (like Mayavi) of using the Enthought Tool Suite to build applications. Don't be confused because the binary distribution called EPD is only free for academic use; EPDLab is completely open source. 
You can check out the source code here: https://svn.enthought.com/svn/enthought/EPDLab/trunk It requires quite a bit of ETS to run first, though. If you have EPD installed, then EPDLab is already available to you. It's still alpha, so I hesitate to advertise it. But it's easy to extend as you would like, so I thought I would chime in on this discussion. Best regards, -Travis -- Travis Oliphant Enthought Inc. 1-512-536-1057 http://www.enthought.com oliph...@enthought.com ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix default to column vector?
Olivier Verdier wrote: Well, allowing dot(A,B,C) does not remove any other possibility does it? I won't fight for this though. I personally don't care but I think that it would remove the last argument for matrices against arrays, namely the fact that A*B*C is easier to write than dot(dot(A,B),C). Well, no. Notation matters to students. Additionally, matrix exponentiation is useful. E.g., A**(N-1) finds the transitive closure of the binary relation represented by the NxN boolean matrix A. Alan Isaac ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
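Alan's transitive-closure remark can be sketched with plain arrays as well; the adjacency matrix below is a made-up example, and `np.linalg.matrix_power` stands in for the matrix class's `**` operator:

```python
import numpy as np

# Hypothetical adjacency matrix of a 4-node chain 0 -> 1 -> 2 -> 3
A = np.array([[0, 1, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 0]])
n = A.shape[0]

# (I + A)**(n-1) has a nonzero (i, j) entry exactly when j is reachable
# from i in at most n-1 steps, i.e. the reflexive-transitive closure.
reach = np.linalg.matrix_power(np.eye(n, dtype=int) + A, n - 1) > 0
```

Adding the identity before exponentiating keeps paths of every length up to n-1, so a single power suffices instead of summing A + A**2 + ... + A**(n-1).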
Re: [Numpy-discussion] matrix default to column vector?
Going back to Alan Isaac's example: 1) beta = (X.T*X).I * X.T * Y 2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y) Robert Kern wrote: 4) beta = la.lstsq(X, Y)[0] I really hate that example. Remember, the example is a **teaching** example. I actually use NumPy in a Master's level math econ course (among other places). As it happens, I do get around to explaining why using an explicit inverse is a bad idea numerically, but that is entirely an aside in a course that is not concerned with numerical methods. It is concerned only with mastering a few basic math tools, and being able to implement some of them in code is largely a check on understanding and precision (and to provide basic background for future applications). Having them use lstsq is counterproductive for the material being covered, at least initially. A typical course of this type uses Excel or includes no applications at all. So please, show a little gratitude. ;-) Alan Isaac ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Fwd: Re: is my numpy installation using custom blas/lapack?
Changing the site.cfg as you suggested did the trick! For what it's worth, setup.py build no longer fails as before at the compilation step (line 95) (I'm still puzzled whether this earlier 'failure' was caused by some error in my build process, but I should probably let it go), and numpy.show_config() now shows ATLAS info under blas_opt_info: blas_opt_info: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/usr/local/rich/src/scipy_build/lib'] define_macros = [('ATLAS_INFO', '\\3.8.3\\')] language = c I guess the short answer for whether non-threaded ATLAS libraries are being used (after being found) by a numpy installation is that there is no short answer. Thanks, Chris, for your patient help! Numpy is a great resource. Rich ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix default to column vector?
On Mon, Jun 8, 2009 at 14:10, Alan G Isaacais...@american.edu wrote: Going back to Alan Isaac's example: 1) beta = (X.T*X).I * X.T * Y 2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y) Robert Kern wrote: 4) beta = la.lstsq(X, Y)[0] I really hate that example. Remember, the example is a **teaching** example. I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations. I actually use NumPy in a Master's level math econ course (among other places). As it happens, I do get around to explaining why using an explicit inverse is a bad idea numerically, but that is entirely an aside in a course that is not concerned with numerical methods. It is concerned only with mastering a few basic math tools, and being able to implement some of them in code is largely a check on understanding and precision (and to provide basic background for future applications). Having them use lstsq is counterproductive for the material being covered, at least initially. A typical course of this type uses Excel or includes no applications at all. So please, show a little gratitude. ;-) If it's not a class where they are going to use what they learn in the future to write numerical programs, I really don't care whether you teach it with numpy or not. If it *is* such a class, then I would prefer that the students get taught the right way to write numerical programs. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
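For concreteness, the three routes being debated can be put side by side on a small made-up regression problem (the data and coefficients below are invented for illustration; `rcond=None` just silences a deprecation warning in recent NumPy):

```python
import numpy as np

# Made-up, well-conditioned regression problem
rng = np.random.RandomState(0)
X = rng.randn(100, 3)
beta_true = np.array([1.0, -2.0, 0.5])
Y = np.dot(X, beta_true) + 0.01 * rng.randn(100)

# 1) Normal equations: solve (X'X) beta = X'Y -- squares the condition number of X
beta_ne = np.linalg.solve(np.dot(X.T, X), np.dot(X.T, Y))

# 2) SVD: X = U S V', so beta = V (U'Y / S)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
beta_svd = np.dot(Vt.T, np.dot(U.T, Y) / s)

# 3) Library routine (decomposition-based under the hood)
beta_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]
```

On a well-conditioned problem all three agree; the pedagogical point in the thread is that route 2 also shows the student exactly where near-singularity bites (the division by the singular values).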
[Numpy-discussion] New datetime dtypes
Hello, I'm working on the new datetime64 and timedelta64 dtypes (as proposed here: http://projects.scipy.org/numpy/browser/trunk/doc/neps/datetime-proposal3.rst). I'm looking through the C code in numpy core, and can't seem to find much in the way of dtypes. Pierre suggested looking through the multiarraymodule file. Where can I find some reference code on the other dtypes? -Marty Fuhry ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] performance matrix multiplication vs. matlab
On 8-Jun-09, at 8:33 AM, Jason Rennie wrote: Note that EM can be very slow to converge: That's absolutely true, but EM for PCA can be a life saver in cases where diagonalizing (or even computing) the full covariance matrix is not a realistic option. Diagonalization can be a lot of wasted effort if all you care about are a few leading eigenvectors. EM also lets you deal with missing values in a principled way, which I don't think you can do with standard SVD. EM certainly isn't a magic bullet, but there are circumstances where it's appropriate. I'm a big fan of the ECG paper too. :) David ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
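For readers curious about the algorithm under discussion, a minimal sketch of EM for PCA in the style of Roweis ("EM Algorithms for PCA and SPCA") might look like the following; the function name and the iteration count are my own choices, and the data in the usage note are made up. The point is that it alternates a projection (E) step and a basis-update (M) step and never forms the d-by-d covariance matrix:

```python
import numpy as np

def em_pca(Y, k, n_iter=200, seed=0):
    """Estimate a k-dimensional principal subspace of the columns of Y
    (d x n, assumed centered) by EM, without building the covariance."""
    d, n = Y.shape
    C = np.random.RandomState(seed).randn(d, k)   # initial subspace guess
    for _ in range(n_iter):
        # E-step: least-squares projection of the data onto span(C)
        X = np.linalg.solve(np.dot(C.T, C), np.dot(C.T, Y))
        # M-step: basis that best reconstructs the data from X
        C = np.dot(np.dot(Y, X.T), np.linalg.inv(np.dot(X, X.T)))
    Q, _ = np.linalg.qr(C)      # orthonormalize; only the span matters
    return Q
```

On strongly low-rank data the span of Q converges to that of the k leading singular vectors, and each iteration costs only O(dnk) rather than the O(d^2 n) of forming the covariance.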
Re: [Numpy-discussion] matrix default to column vector?
On Mon, Jun 8, 2009 at 3:33 PM, Robert Kernrobert.k...@gmail.com wrote: On Mon, Jun 8, 2009 at 14:10, Alan G Isaacais...@american.edu wrote: Going back to Alan Isaac's example: 1) beta = (X.T*X).I * X.T * Y 2) beta = np.dot(np.dot(la.inv(np.dot(X.T,X)),X.T),Y) Robert Kern wrote: 4) beta = la.lstsq(X, Y)[0] I really hate that example. Remember, the example is a **teaching** example. I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations. I actually use NumPy in a Master's level math econ course (among other places). As it happens, I do get around to explaining why using an explicit inverse is a bad idea numerically, but that is entirely an aside in a course that is not concerned with numerical methods. It is concerned only with mastering a few basic math tools, and being able to implement some of them in code is largely a check on understanding and precision (and to provide basic background for future applications). Having them use lstsq is counterproductive for the material being covered, at least initially. A typical course of this type uses Excel or includes no applications at all. So please, show a little gratitude. ;-) If it's not a class where they are going to use what they learn in the future to write numerical programs, I really don't care whether you teach it with numpy or not. If it *is* such a class, then I would prefer that the students get taught the right way to write numerical programs. I started in such a class (with Dr. Isaac as a matter of fact). I found the use of Python with Numpy to be very enlightening for the basic concepts of linear algebra. I appreciated the simple syntax of matrices at the time as a gentler learning curve since my background in programming was mainly at a hobbyist level. 
I then went on to take a few econometrics courses where we learned the normal equations. Now a few years later I am working on scipy.stats as a Google Summer of Code project, and I am learning why an SVD decomposition is much more efficient (an economist never necessarily *needs* to know what's under the hood of their stats package). The intuition for the numerical methods was in place, as well as the basic familiarity with numpy/scipy. So I would not discount this approach too much. People get what they want out of anything, and I was happy to learn about Python and Numpy/Scipy as alternatives to proprietary packages. And I hope my work this summer can contribute even a little to making the project an accessible alternative for researchers without a strong technical background. Skipper ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] The SciPy Doc Marathon continues
Let's Finish Documenting SciPy! Last year, we began the SciPy Documentation Marathon to write reference pages (docstrings) for NumPy and SciPy. It was a huge job, bigger than we first imagined, with NumPy alone having over 2,000 functions. We created the doc wiki (now at docs.scipy.org), where you write, review, and proofread docs that then get integrated into the source code. In September, we had over 55% of NumPy in the first draft stage, and about 25% to the needs review stage. The PDF NumPy Reference Guide was over 300 pages, nicely formatted by ReST, which makes an HTML version as well. The PDF document now has over 500 pages, with the addition of sections from Travis Oliphant's book Guide to NumPy. That's an amazing amount of work, possible through the contributions of over 30 volunteers. It came back to us as the vastly-expanded help pages in NumPy 1.2, released last September. With your help, WE CAN FINISH! This summer we can: - Write all the important NumPy pages to the Needs Review stage - Start documenting the SciPy package - Get the SciPy User Manual started - Implement dual review - technical and presentation - on the doc wiki - Get NumPy docs and packaging on a sound financial footing We'll start with the first two. UCF has hired David Goldsmith to lead this summer's doc effort. David will write a lot of docs himself, but more importantly, he will organize our efforts toward completing doc milestones. There will be rewards, T-shirts, and likely other fun stuff for those who contribute the most. David will start the ball rolling shortly. This is a big vision, and it will require YOUR help to make it happen! The main need now is for people to work on the reference pages. Here's how: 1. Go to http://docs.scipy.org/NumPy 2. Read the intro and doc standards, and some docstrings on the wiki 3. Make an account 4. Ask the scipy-...@scipy.org email list for editor access 5. EDIT! 
All doc discussions (except announcements like this one) should happen on the scipy-...@scipy.org email list. You can browse the archives and sign up for the list at http://scipy.org/Mailing_Lists . That's where we will announce sprints on topic areas and so on. We'll also meet online every week, Wednesdays at 4:30pm US Eastern Time, on Skype. David will give the details. Welcome back to the Marathon! --jh-- Prof. Joseph Harrington Planetary Sciences Group Department of Physics MAP 414 4000 Central Florida Blvd. University of Central Florida Orlando, FL 32816-2385 j...@physics.ucf.edu planets.ucf.edu ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
Gökhan SEVER wrote: So far, IPython-Scite is the fastest that I can build my programs. Experiment in Ipython and build pieces in Scite. I would like to know what others use in this respect? Peppy (http://peppy.flipturn.org/) + iPython It would be nice to have those two integrated, though it's really not hard to switch between them. iPython's run is wonderful. A note about Peppy: It's pretty new, not widely used, heavyweight and not feature complete. However, it does a few things right (by my personal definition of right ;-) ) that I haven't seen in any other editor: * Modern GUI (i.e. not Emacs or vim, which probably get everything else right...) * Scripted/written in Python (the other reason not to use Emacs/vim) * designed to be general purpose, not primarily python * multiple top-level windows, and the ability to edit the same file in multiple windows at once. * Python (and other languages) indenting done right (i.e. like Emacs Python mode) * Fully cross platform (Windows, Mac, *nix (GTK) ) * all its other features are pretty common... Major missing feature: code completion -- I'm really starting to like that in iPython... It's been my primary editor for a year or so, and hasn't destroyed any data yet! I'd love to see it get wider use. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix default to column vector?
2009/6/8 Robert Kern robert.k...@gmail.com: Remember, the example is a **teaching** example. I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations. I agree, and I wish our curriculum followed that route. In linear algebra, I also don't much like the way eigenvalues are taught, where students have to solve characteristic polynomials by hand. When I teach the subject again, I'll pay more attention to these books: Numerical linear algebra by Lloyd Trefethen http://books.google.co.za/books?id=bj-Lu6zjWbEC (e.g. has SVD in Lecture 4) Applied Numerical Linear Algebra by James Demmel http://books.google.co.za/books?id=lr8cFi-YWnIC (e.g. has perturbation theory on page 4) Regards Stéfan ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix default to column vector?
2009/6/8 Stéfan van der Walt ste...@sun.ac.za: 2009/6/8 Robert Kern robert.k...@gmail.com: Remember, the example is a **teaching** example. I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations. I agree, and I wish our curriculum followed that route. In linear algebra, I also don't much like the way eigenvalues are taught, where students have to solve characteristic polynomials by hand. When I teach the subject again, I'll pay more attention to these books: Numerical linear algebra by Lloyd Trefethen http://books.google.co.za/books?id=bj-Lu6zjWbEC (e.g. has SVD in Lecture 4) Applied Numerical Linear Algebra by James Demmel http://books.google.co.za/books?id=lr8cFi-YWnIC (e.g. has perturbation theory on page 4) Regards Stéfan Ok, I also have to give my 2 cents. Any basic econometrics textbook warns of multicollinearity. Since economists are mostly interested in the parameter estimates, the covariance matrix needs to have little multicollinearity, otherwise the standard errors of the parameters will be huge. If I automatically use pinv or lstsq, then, unless I look at the condition number and singularities, I get estimates that look pretty nice, even though they embody an arbitrary choice of the indeterminacy. So in economics, I never worried too much about the numerical precision of the inverse, because, if the correlation matrix is close to singular, the model is misspecified, or needs reparameterization, or the data is useless for the question. Compared to endogeneity bias, for example, or homoscedasticity assumptions and so on, the numerical problem is pretty small. 
This doesn't mean matrix decomposition methods are not useful for numerical calculations and efficiency, but I don't think the numerical problem deserves a lot of emphasis in a basic econometrics class. Josef ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
Travis Oliphant wrote: You might take a look at EPDLab as well. Thanks to Gael Varoquaux, It integrates IPython into an Envisage application and has a crude name-space browser I was wondering when you guys would get around to making one of those. Nice start, the iPython shell is nice, though the editor needs a lot of features -- I wonder if you could integrate an existing wxPython editor: Editra Peppy SPE PyPE ... And get full featured editor that way. Winpdb would be nice, too -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] matrix default to column vector?
On Mon, Jun 8, 2009 at 15:21, josef.p...@gmail.com wrote: 2009/6/8 Stéfan van der Walt ste...@sun.ac.za: 2009/6/8 Robert Kern robert.k...@gmail.com: Remember, the example is a **teaching** example. I know. Honestly, I would prefer that teachers skip over the normal equations entirely and move directly to decomposition approaches. If you are going to make them implement least-squares from more basic tools, I think it's more enlightening as a student to start with the SVD than the normal equations. I agree, and I wish our curriculum followed that route. In linear algebra, I also don't much like the way eigenvalues are taught, where students have to solve characteristic polynomials by hand. When I teach the subject again, I'll pay more attention to these books: Numerical linear algebra by Lloyd Trefethen http://books.google.co.za/books?id=bj-Lu6zjWbEC (e.g. has SVD in Lecture 4) Applied Numerical Linear Algebra by James Demmel http://books.google.co.za/books?id=lr8cFi-YWnIC (e.g. has perturbation theory on page 4) Regards Stéfan Ok, I also have to give my 2 cents. Any basic econometrics textbook warns of multicollinearity. Since economists are mostly interested in the parameter estimates, the covariance matrix needs to have little multicollinearity, otherwise the standard errors of the parameters will be huge. If I automatically use pinv or lstsq, then, unless I look at the condition number and singularities, I get estimates that look pretty nice, even though they embody an arbitrary choice of the indeterminacy. So in economics, I never worried too much about the numerical precision of the inverse, because, if the correlation matrix is close to singular, the model is misspecified, or needs reparameterization, or the data is useless for the question. Compared to endogeneity bias, for example, or homoscedasticity assumptions and so on, the numerical problem is pretty small. 
This doesn't mean matrix decomposition methods are not useful for numerical calculations and efficiency, but I don't think the numerical problem deserves a lot of emphasis in a basic econometrics class. Actually, my point is a bit broader. Numerics aside, if you are going to bother peeking under the hood of least-squares at all, I think the student gets a better understanding of least-squares via one of the decomposition methods rather than the normal equations. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 8, 2009 at 15:34, Christopher Barkerchris.bar...@noaa.gov wrote: Travis Oliphant wrote: You might take a look at EPDLab as well. Thanks to Gael Varoquaux, It integrates IPython into an Envisage application and has a crude name-space browser I was wondering when you guys would get around to making one of those. Nice start, the iPython shell is nice, though the editor needs a lot of features -- I wonder if you could integrate an existing wxPython editor: Editra Peppy SPE PyPE ... And get full featured editor that way. That's what this part is for: https://svn.enthought.com/svn/enthought/EPDLab/trunk/enthought/epdlab/remote_editor/ -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 8, 2009 at 15:34, Robert Kernrobert.k...@gmail.com wrote: On Mon, Jun 8, 2009 at 15:34, Christopher Barkerchris.bar...@noaa.gov wrote: Travis Oliphant wrote: You might take a look at EPDLab as well. Thanks to Gael Varoquaux, It integrates IPython into an Envisage application and has a crude name-space browser I was wondering when you guys would get around to making one of those. Nice start, the iPython shell is nice, though the editor needs a lot of features -- I wonder if you could integrate an existing wxPython editor: Editra Peppy SPE PyPE ... And get full featured editor that way. That's what this part is for: https://svn.enthought.com/svn/enthought/EPDLab/trunk/enthought/epdlab/remote_editor/ More accurately, these: https://svn.enthought.com/svn/enthought/EnvisagePlugins/trunk/enthought/plugins/remote_editor/editor_plugins/ -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On 8-Jun-09, at 12:58 PM, Jonno wrote: Thanks Josef, I shouldn't have included Matplotlib since Pydee does work well with its plots. I had forgotten that. It really is just the Mayavi plots (or scenes I guess) that don't play well. I don't know how exactly matplotlib integration issues are handled in Pydee, but I do know that it's all in Qt. You should be able to set the environment variable ETS_TOOLKIT='qt4' to make Mayavi use the (somewhat neglected but still functional, AFAIK) Qt backend. Wx and Qt event loops competing might be the problem. David ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 08, 2009 at 05:19:11PM -0400, David Warde-Farley wrote: On 8-Jun-09, at 12:58 PM, Jonno wrote: Thanks Josef, I shouldn't have included Matplotlib since Pydee does work well with its plots. I had forgotten that. It really is just the Mayavi plots (or scenes I guess) that don't play well. I don't know how exactly matplotlib integration issues are handled in Pydee, but I do know that it's all in Qt. You should be able to set the environment variable ETS_TOOLKIT='qt4' to make Mayavi use the (somewhat neglected but still functional, AFAIK) Qt backend. Wx and Qt event loops competing might be the problem. Correct. And as you point out, the Qt backend of Mayavi is less functional, because (due to licensing reasons) there is less economic pressure to make it work. Gaël ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
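For reference, selecting the backend the way David describes is just a matter of setting the environment variable before anything from ETS is imported; a minimal sketch (the 'qt4' value is the one suggested in the thread, and the commented import is only an example of an ETS import that must come afterwards):

```python
import os

# Must be set before the first ETS/Mayavi import, or it has no effect.
os.environ['ETS_TOOLKIT'] = 'qt4'

# ...only now import the ETS machinery, e.g.:
# from enthought.mayavi import mlab
```

Setting it in the shell (`export ETS_TOOLKIT=qt4`) before launching the application achieves the same thing without touching the code.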
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 08, 2009 at 12:54:25PM -0500, Travis Oliphant wrote: You might take a look at EPDLab as well. Thanks to Gael Varoquaux, it integrates IPython into an Envisage application and has a crude name-space browser. And it integrates with Editra to have an editor where you can select code and run it in EPDLab, in addition to running the whole file. And we claim that having the editor as a separate process is a feature: first, we can use a good editor (actually any editor, provided you write a plugin that sends the code to be executed to EPDLab via sockets); second, if your execution environment crashes (and yes, this can happen, especially if you are running custom C code bound into Python), you don't lose your editor. Getting things right with the IPython shell was a bastard, and there is still work to be done (although the latest release of IPython introduces continuation lines! Yay). On the other hand, you can add a lot of features based on the existing code. And all the components are reusable components, which means that you can build your own application with them. I really cannot work on EPDLab anymore, I don't have time (thanks a lot to Enthought for financing my work on such a thrilling project). I do maintenance on the IPython wx frontend, because I am the person who knows the code best (unfortunately), although Laurent Dufrechou is helping out. However, I strongly encourage people to contribute to EPDLab. It is open source, in an open SVN. You can get check-in rights if you show that your contributions are of quality. Enthought is a company, and has its own agenda. It needs to sell products to consumers, so it might not be interested in investing time where you might (although Enthought has proven more than once that they can invest time on long-term projects, just because they believe they are good for the future of scientific computing in Python). 
On the other hand, if you are willing to devote time to add what you think is lacking in EPDLab (whatever it might be), _you_ can make a difference. I believe in the future of EPDLab because it is based on very powerful components, like IPython, Traits, matplotlib, Wx, Mayavi, Chaco. I believe that choosing to work with a powerful stack like this has an upfront cost: you need to make sure everything fits together. Last year I sent micro-patches to matplotlib to make sure event loops were correctly detected. I spent more than a month working in the IPython code base. The Enthought guys fixed Traits bugs. This is costly and takes time. However, this can get you far, very far. All right, back to catching up with life :) Gaël ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate
Hi, folks. Unable to find a printed reference for the definition we use to compute the functions in the Subject line of this email, I posted a couple queries for help in this regard in the Discussion for fv (http://docs.scipy.org/numpy/docs/numpy.lib.financial.fv/#discussion-sec). josef Pktd's reply (thanks!) just makes me even more doubtful that we're using the definition that most users from the financial community would be expecting. At this point, I have to say, I'm very concerned that our implementation for these is wrong (or at least inconsistent with what's used in financial circles); if you know of a reference - less ephemeral than a solely electronic document - defining these functions as we've implemented them, please share. Thanks! David Goldsmith PS: Some of the financial functions' help doc says they're unimplemented - are there plans to implement them, and if not, why do we have help doc for them? ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
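For comparison with whatever reference turns up, the convention spreadsheet programs follow (and, as far as I can tell, the one the numpy implementation intends) is the time-value-of-money identity fv + pv*(1+rate)**nper + pmt*(1+rate*when)*((1+rate)**nper - 1)/rate = 0. A hedged, self-contained sketch of fv under that convention (the function below is my own illustration, not the numpy source):

```python
def fv(rate, nper, pmt, pv, when=0):
    """Future value under the usual spreadsheet sign convention:
    cash you pay out (pv, pmt) is negative, cash you receive is positive.
    when=0 means payments at period end, when=1 at period start."""
    if rate == 0:
        # Limit of the annuity formula as rate -> 0
        return -(pv + pmt * nper)
    growth = (1 + rate) ** nper
    return -(pv * growth + pmt * (1 + rate * when) * (growth - 1) / rate)

# $100 initial deposit plus $100/month for 10 years at 5%/year
result = fv(0.05 / 12, 10 * 12, -100, -100)
```

A value computed this way can be checked directly against the docstring examples on the doc wiki; agreement (or not) would settle whether the implemented definition matches the spreadsheet convention.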
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
Gael Varoquaux wrote: Click in the menu: 'new file in remote browser', or something like this. If you have editra installed, it will launch it, with a special plugin allowing you to execute selected code in EPDLab. very cool, thanks! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR(206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 08, 2009 at 01:34:39PM -0700, Christopher Barker wrote: Travis Oliphant wrote: You might take a look at EPDLab as well. Thanks to Gael Varoquaux, It integrates IPython into an Envisage application and has a crude name-space browser I was wondering when you guys would get around to making one of those. Nice start, the iPython shell is nice, though the editor needs a lot of features -- I wonder if you could integrate an existing wxPython editor: Editra Click in the menu: 'new file in remote browser', or something like this. If you have editra installed, it will launch it, with a special plugin allowing you to execute selected code in EPDLab. ;) Gaël ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] How to remove fortran-like loops with numpy?
Hi all, I'm new in numpy. Actually, I'm new in Python. In order to learn a bit, I want to create a program to plot the Mandelbrot set. This program is quite simple, and I have already programmed it. The problem is that I come from Fortran, so I am used to thinking in for loops. I know that it is not the best way to use Python, and in fact the performance of the program is more than poor. Here is the program:

#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plt

# Some parameters
Xmin = -1.5
Xmax = 0.5
Ymin = -1
Ymax = 1
Ds = 0.01

# Initialization of variables
X = np.arange(Xmin, Xmax, Ds)
Y = np.arange(Ymax, Ymin, -Ds)
N = np.zeros((X.shape[0], Y.shape[0]), 'f')

## Here the calculations are inefficient
for i in range(X.shape[0]):
    for j in range(Y.shape[0]):
        z = complex(0.0, 0.0)
        c = complex(X[i], Y[j])
        while N[i, j] < 30 and abs(z) < 2:
            N[i, j] += 1
            z = z**2 + c
        if N[i, j] == 29:
            N[i, j] = 0

# And now, just for plotting...
N = N.transpose()
fig = plt.figure()
plt.imshow(N, cmap=plt.cm.Blues)
plt.title('Mandelbrot set')
plt.xticks([]); plt.yticks([])
plt.show()
fig.savefig('test.png')

As you can see, it is very simple, but it takes several seconds running just to create a 200x200 plot. Fortran takes the same time to create a 2000x2000 plot, around 100 times faster... So the question is, do you know how to program this in a Python-like fashion in order to seriously improve the performance? Thanks in advance -- Juan José Gómez Navarro Edificio CIOyN, Campus de Espinardo, 30100 Departamento de Física Universidad de Murcia Tfno. (+34) 968 398552 Email: juanjo.gomeznava...@gmail.com Web: http://ciclon.inf.um.es/Inicio.html ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Summer NumPy Doc Marathon (Reply-to: scipy-...@scipy.org)
Dear SciPy Community Members: Hi! My name is David Goldsmith. I've been hired for the summer by Joe Harrington to further progress on NumPy documentation and ultimately, pending funding, SciPy documentation. Joe and I are reviving last summer’s enthusiasm in the community for this mission and enlisting as many of you as possible in the effort. On that note, please peruse the NumPy Doc Wiki (http://docs.scipy.org/numpy/Front Page/) and, in particular, the master list of functions/objects (“items”) needing work (http://docs.scipy.org/numpy/Milestones/). Our goal is to have every item to the ready-for-first-review stage (or better) by August 18 (i.e., the start of SciPyCon09). To accomplish this, we're forming teams to attack each doc category on the Milestones page. From the Milestones page: To speed things up, get more uniformity in the docs, and add a social element, we're attacking these categories as teams. A team lead takes responsibility for getting a category to Needs review within one month [we expect that some categories will require less time – please furnish your most ‘optimistically realistic’ deadline when ‘claiming’ a category], but no later than 18 August 2009. As leader, you commit to working with anyone who signs up in your category, and vice versa. The scipy-dev mailing list is a great place to recruit helpers. Major doc contributors will be listed in NumPy's contributors file, THANKS.txt. Anyone writing more than 1000 words will get a T-shirt (while supplies last, etc.). Teams that reach their goals in time will get special mention in THANKS.txt. Of course, you don't have to join a team. If you'd like to work on your own, please choose docstrings from an unclaimed category, and put your name after docstrings you are editing in the list below. If someone later claims that category, please coordinate with them or finish up your current docstrings and move to another category. 
Please note that, to edit anything on the Wiki (including the doc itself), you’ll need “edit rights” – how you get these is Item 5 under “Before you start” on the “Front Page,” but for your convenience, I’ll quote that here: Register a username on [docs.scipy.org]. Send an e-mail with your username to the scipy-dev mailing list (requires subscribing to the mailing list first, [which can be done at http://mail.scipy.org/mailman/listinfo/scipy-dev]), so that we can give you edit rights. If you are not subscribed to the mailing-list, you can also send an email to gael dot varoquaux at normalesup dot org, but this will take longer [and you’ll want to subscribe to scipy-dev anyway, because that’s the place to post questions and comments about this whole doc development project]. Also, I’ll be holding a weekly Skype (www.skype.com) telecon – Wednesdays at 4:30pm US Eastern Daylight Time - to review progress and discuss any roadblocks we may have encountered (or anticipate encountering). If you’d like to participate and haven’t already downloaded and installed Skype and registered a Skype ID, you should do those things; then, you'll be able to join in simply by Skyping me (Skype ID: d.l.goldsmith) and I'll add you to the call. So, thanks for your time reading this, and please make time this summer to help us meet (or beat) the goal. Sincerely, David Goldsmith, Technical Editor Olympia, WA ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to remove fortran-like loops with numpy?
I look forward to an instructive reply: the Pythonic way to do it would be to take advantage of the facts that Numpy is pre-vectorized and uses broadcasting, but so far I haven't been able to figure out (though I haven't yet really buckled down and tried real hard) how to broadcast a conditionally-terminated iteration where the number of iterations will vary among the array elements. Hopefully someone else already has. :-) DG --- On Mon, 6/8/09, Juanjo Gomez Navarro juanjo.gomeznava...@gmail.com wrote: From: Juanjo Gomez Navarro juanjo.gomeznava...@gmail.com Subject: [Numpy-discussion] How to remove fortran-like loops with numpy? To: numpy-discussion@scipy.org Date: Monday, June 8, 2009, 2:52 PM Hi all, I'm new to numpy. Actually, I'm new to Python. In order to learn a bit, I want to create a program to plot the Mandelbrot set. The program is quite simple, and I have already written it. The problem is that I come from Fortran, so I am used to thinking in for loops. I know that this is not the best way to use Python, and in fact the performance of the program is more than poor. Here is the program:

#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plt

# Some parameters
Xmin = -1.5
Xmax = 0.5
Ymin = -1
Ymax = 1
Ds = 0.01

# Initialization of variables
X = np.arange(Xmin, Xmax, Ds)
Y = np.arange(Ymax, Ymin, -Ds)
N = np.zeros((X.shape[0], Y.shape[0]), 'f')

## Here the calculations are inefficient
for i in range(X.shape[0]):
    for j in range(Y.shape[0]):
        z = complex(0.0, 0.0)
        c = complex(X[i], Y[j])
        while N[i, j] < 30 and abs(z) < 2:
            N[i, j] += 1
            z = z**2 + c
        if N[i, j] == 29:
            N[i, j] = 0

# And now, just for plotting...
N = N.transpose()
fig = plt.figure()
plt.imshow(N, cmap=plt.cm.Blues)
plt.title('Mandelbrot set')
plt.xticks([]); plt.yticks([])
plt.show()
fig.savefig('test.png')

As you can see, it is very simple, but it takes several seconds just to create a 200x200 plot. Fortran takes the same time to create a 2000x2000 plot, around 100 times faster...
So the question is: do you know how to program this in a Pythonic fashion in order to seriously improve the performance? Thanks in advance -- Juan José Gómez Navarro Edificio CIOyN, Campus de Espinardo, 30100 Departamento de Física Universidad de Murcia Tfno. (+34) 968 398552 Email: juanjo.gomeznava...@gmail.com Web: http://ciclon.inf.um.es/Inicio.html ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Mon, Jun 8, 2009 at 4:47 PM, Christopher Barker chris.bar...@noaa.govwrote: Gael Varoquaux wrote: Click in the menu: 'new file in remote browser', or something like this. If you have editra installed, it will launch it, with a special plugin allowing you to execute selected code in EPDLab. very cool, thanks! -Chris IPython's edit command works in a similar fashion, too. edit test.py open an existing file or creates one, and right after you close the file IPy executes the content. These are from ipy_user_conf.py file: # Configure your favourite editor? # Good idea e.g. for %edit os.path.isfile import ipy_editors # Choose one of these: ipy_editors.scite() #ipy_editors.scite('c:/opt/scite/scite.exe') #ipy_editors.komodo() #ipy_editors.idle() # ... or many others, try 'ipy_editors??' after import to see them Gökhan ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to remove fortran-like loops with numpy?
On Mon, Jun 8, 2009 at 17:04, David Goldsmith d_l_goldsm...@yahoo.com wrote: I look forward to an instructive reply: the Pythonic way to do it would be to take advantage of the facts that Numpy is pre-vectorized and uses broadcasting, but so far I haven't been able to figure out (though I haven't yet really buckled down and tried real hard) how to broadcast a conditionally-terminated iteration where the number of iterations will vary among the array elements. Hopefully someone else already has. :-) You can't, really. What you can do is just keep iterating with the whole data set and ignore the parts that have already converged. Here is an example:

z = np.zeros((201, 201), dtype=complex)
Y, X = np.mgrid[1:-1:-201j, -1.5:0.5:201j]
c = np.empty_like(z)
c.real = X
c.imag = Y
N = np.zeros(z.shape, dtype=int)
while ((N < 30) & (abs(z) < 2)).any():
    N += abs(z) < 2
    z = z ** 2 + c
N[N >= 30] = 0

-- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to remove fortran-like loops with numpy?
2009/6/8 Robert Kern robert.k...@gmail.com: On Mon, Jun 8, 2009 at 17:04, David Goldsmith d_l_goldsm...@yahoo.com wrote: I look forward to an instructive reply: the Pythonic way to do it would be to take advantage of the facts that Numpy is pre-vectorized and uses broadcasting, but so far I haven't been able to figure out (though I haven't yet really buckled down and tried real hard) how to broadcast a conditionally-terminated iteration where the number of iterations will vary among the array elements. Hopefully someone else already has. :-) You can't, really. What you can do is just keep iterating with the whole data set and ignore the parts that have already converged. Here is an example: Well, yes and no. This is only worth doing if the number of problem points that require many iterations is small - not the case here without some sort of periodicity detection - but you can keep an array of not-yet-converged points, which you iterate. When some converge, you store them in a results array (with fancy indexing) and remove them from your still-converging array. It's also worth remembering that the overhead of for loops is large but not enormous, so you can often remove only the inner for loop, in this case perhaps iterating over the image a line at a time. Anne

z = np.zeros((201, 201), dtype=complex)
Y, X = np.mgrid[1:-1:-201j, -1.5:0.5:201j]
c = np.empty_like(z)
c.real = X
c.imag = Y
N = np.zeros(z.shape, dtype=int)
while ((N < 30) & (abs(z) < 2)).any():
    N += abs(z) < 2
    z = z ** 2 + c
N[N >= 30] = 0

-- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
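Anne's not-yet-converged bookkeeping can be sketched as follows — a minimal illustration, not code from the thread; the function name and the working-array names (`idx`, `keep`) are mine. The still-active points live in a flat array, and fancy indexing scatters each batch of escape times back into the full result:

```python
import numpy as np

def mandelbrot(nx=201, ny=201, maxit=30):
    # Same grid as Robert's example: x in [-1.5, 0.5], y in [1, -1]
    Y, X = np.mgrid[1:-1:ny * 1j, -1.5:0.5:nx * 1j]
    c = (X + 1j * Y).ravel()

    z = np.zeros_like(c)              # active-set iterates
    idx = np.arange(c.size)           # indices of still-active points
    N = np.zeros(c.size, dtype=int)   # escape times (0 = never escaped)

    for it in range(maxit):
        z = z ** 2 + c[idx]
        escaped = np.abs(z) > 2
        N[idx[escaped]] = it          # scatter results via fancy indexing
        keep = ~escaped               # drop escaped points from the active set
        idx, z = idx[keep], z[keep]

    # interior (never-escaped) points stay 0, matching the thread's colouring
    return N.reshape(ny, nx)
```

Because escaped points are dropped each pass, the inner arrays shrink as the iteration proceeds; that is the trade-off under discussion — the fancy-indexing transfers cost something, so this only beats the iterate-everything version when many points finish early.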
Re: [Numpy-discussion] How to remove fortran-like loops with numpy?
Thanks, Robert! DG --- On Mon, 6/8/09, Robert Kern robert.k...@gmail.com wrote: I haven't been able to figure out (though I haven't yet really buckled down and tried real hard) how to broadcast a conditionally-terminated iteration where the number of iterations will vary among the array elements. Hopefully someone else already has. :-) You can't, really. What you can do is just keep iterating with the whole data set and ignore the parts that have already converged. Here is an example:

z = np.zeros((201, 201), dtype=complex)
Y, X = np.mgrid[1:-1:-201j, -1.5:0.5:201j]
c = np.empty_like(z)
c.real = X
c.imag = Y
N = np.zeros(z.shape, dtype=int)
while ((N < 30) & (abs(z) < 2)).any():
    N += abs(z) < 2
    z = z ** 2 + c
N[N >= 30] = 0

-- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to remove fortran-like loops with numpy?
--- On Mon, 6/8/09, Anne Archibald peridot.face...@gmail.com wrote: You can't, really. What you can do is just keep iterating with the whole data set and ignore the parts that have already converged. Here is an example: Well, yes and no. This is only worth doing if the number of problem points that require many iterations is small - not the case here without some sort of periodicity detection - but you can keep an array of not-yet-converged points, which you iterate. When some converge, you store them in a results array (with fancy indexing) and remove them from your still-converging array. Thanks, Anne. This is the way I had anticipated implementing it myself eventually, but the fancy-indexing requirement has caused me to keep postponing it, waiting for some time when I'll have a hefty block of time to figure it out and then, inevitably, debug it. :( Also, the transfer of points from un-converged to converged - when that's a large number, might that not be a large time-suck compared to Rob's method? (Too bad this wasn't posted a couple weeks ago: I'd've had time then to implement your method and race it against Rob's, but alas, now I have this doc editing job...but that's a good thing, as my fractals are not yet making me any real money.) :-) It's also worth remembering that the overhead of for loops is large but not enormous, so you can often remove only the inner for loop, in this case perhaps iterating over the image a line at a time. Yes, definitely well worth remembering - thanks for reminding us! Thanks again, DG Anne ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to remove fortran-like loops with numpy?
On Mon, Jun 8, 2009 at 18:01, David Goldsmithd_l_goldsm...@yahoo.com wrote: --- On Mon, 6/8/09, Anne Archibald peridot.face...@gmail.com wrote: You can't, really. What you can do is just keep iterating with the whole data set and ignore the parts that have already converged. Here is an example: Well, yes and no. This is only worth doing if the number of problem points that require many iterations is small - not the case here without some sort of periodicity detection - but you can keep an array of not-yet-converged points, which you iterate. When some converge, you store them in a results array (with fancy indexing) and remove them from your still-converging array. Thanks, Anne. This is the way I had anticipated implementing it myself eventually, but the fancy-indexing requirement has caused me to keep postponing it, waiting for some time when I'll have a hefty block of time to figure it out and then, inevitably, debug it. :( Also, the transfer of points from un-converged to converged - when that's a large number, might that not be a large time-suck compared to Rob's method? (Too bad this wasn't posted a couple weeks ago: I'd've had time then to implement your method and race it against Rob's, but alas, now I have this doc editing job...but that's a good thing, as my fractals are not yet making me any real money.) :-) The advantage of my implementation is that I didn't have to think too hard about it. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate
On Mon, Jun 8, 2009 at 5:44 PM, d_l_goldsm...@yahoo.com wrote: Hi, folks. Unable to find a printed reference for the definition we use to compute the functions in the Subject line of this email, I posted a couple queries for help in this regard in the Discussion for fv (http://docs.scipy.org/numpy/docs/numpy.lib.financial.fv/#discussion-sec). josef Pktd's reply (thanks!) just makes me even more doubtful that we're using the definition that most users from the financial community would be expecting. At this point, I have to say, I'm very concerned that our implementation for these is wrong (or at least inconsistent with what's used in financial circles); if you know of a reference - less ephemeral than a solely electronic document - defining these functions as we've implemented them, please share. Thanks! Just quickly comparing In [3]: np.lib.financial.fv(.1,10,-100,-350) Out[3]: 2501.5523211350032 With OO Calc =fv(.1,10,-100,-350) =2501.55 Both return the value of 350*1.1**10 + 100*1.1**9 + ... + 100*1.1 which is what I would expect it to do. I didn't look too closely at the docs though, so they might be a bit confusing and need some cleaning up. There was a recent discussion about numpy.financial in this thread http://mail.scipy.org/pipermail/numpy-discussion/2009-May/042709.html. The way that it was left is that they are there as teaching tools to mimic *some* of the functionality of spreadsheets/ financials calculators. I'm currently working on implementing some other common spreadsheet/ financial calculator on my own for possible inclusion somewhere later, as I think was the original vision http://thread.gmane.org/gmane.comp.python.numeric.general/20027. Skipper ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
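As a sanity check on the numbers quoted above, here is a hedged sketch (variable names are mine, not from the thread) of the ordinary-annuity future value that both NumPy and OO Calc appear to compute, assuming end-of-period payments:

```python
# fv(.1, 10, -100, -350): the 350 compounds for 10 periods, and the ten
# 100s form an ordinary annuity (the last payment earns no interest).
r, n, pmt, pv = 0.10, 10, 100.0, 350.0

# Closed form using the geometric-series simplification.
fv_closed = pv * (1 + r) ** n + pmt * ((1 + r) ** n - 1) / r

# The same value as the explicit sum 350*1.1**10 + 100*1.1**9 + ... + 100.
fv_sum = pv * (1 + r) ** n + sum(pmt * (1 + r) ** (n - k) for k in range(1, n + 1))
```

Both agree with the `In [3]` result of roughly 2501.55 quoted above.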
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On 09/06/09 00:16, Gael Varoquaux wrote: On Mon, Jun 08, 2009 at 05:14:27PM -0500, Gökhan SEVER wrote: IPython's edit command works in a similar fashion, too. edit test.py The cool thing is that you can select text in the editor and execute in EPDLab. On the other hand, I know that IPython has hooks to grow this in the code base, and I would like this to grow also directly in IPython. Hell, I use vim. How cool would it be to select (using visual mode) snippets in vim, and execute them in a running Ipython session. I think there's a vim script for executing the marked code in python. If IPython has already hooks for executing code in an existing session, it might be possible to adapt this script. Also I encourage everyone to have a look at pida: http://pida.co.uk/ which is a python IDE using an embedded vim (although you can embed other editors as well I think). The website looks like development has been stale, but if you look at svn there've been commits lately. Cheers Jochen Gaël ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to remove fortran-like loops with numpy?
Hi Juan 2009/6/8 Juanjo Gomez Navarro juanjo.gomeznava...@gmail.com: I'm new to numpy. Actually, I'm new to Python. In order to learn a bit, I want to create a program to plot the Mandelbrot set. The program is quite simple, and I have already written it. The problem is that I come from Fortran, so I am used to thinking in for loops. I know that this is not the best way to use Python, and in fact the performance of the program is more than poor. Here is the program:

#!/usr/bin/python
import numpy as np
import matplotlib.pyplot as plt

# Some parameters
Xmin = -1.5
Xmax = 0.5
Ymin = -1
Ymax = 1
Ds = 0.01

# Initialization of variables
X = np.arange(Xmin, Xmax, Ds)
Y = np.arange(Ymax, Ymin, -Ds)
N = np.zeros((X.shape[0], Y.shape[0]), 'f')

## Here the calculations are inefficient
for i in range(X.shape[0]):
    for j in range(Y.shape[0]):
        z = complex(0.0, 0.0)
        c = complex(X[i], Y[j])
        while N[i, j] < 30 and abs(z) < 2:
            N[i, j] += 1
            z = z**2 + c
        if N[i, j] == 29:
            N[i, j] = 0

# And now, just for plotting...
N = N.transpose()
fig = plt.figure()
plt.imshow(N, cmap=plt.cm.Blues)
plt.title('Mandelbrot set')
plt.xticks([]); plt.yticks([])
plt.show()
fig.savefig('test.png')

As you can see, it is very simple, but it takes several seconds just to create a 200x200 plot. Fortran takes the same time to create a 2000x2000 plot, around 100 times faster... So the question is: do you know how to program this in a Pythonic fashion in order to seriously improve the performance? Here is another version, similar to Robert's, that I wrote up for the documentation project last year: http://mentat.za.net/numpy/intro/intro.html We never used it, but I still like the pretty pictures :-) Cheers Stéfan ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] From CorePy: New ExtBuffer object
Hi, Just a heads-up on something they're talking about over at CorePy. Regards Stéfan -- Forwarded message -- From: Andrew Friedley afrie...@osl.iu.edu Date: 2009/6/8 Subject: [Corepy-devel] New ExtBuffer object To: CorePy Development corepy-de...@osl.iu.edu I wrote a new buffer object today, called ExtBuffer, that can be used with libraries/objects that support the Python 2.6 buffer interface (e.g. NumPy). This brings page-aligned memory (and huge-page) support to anything that can use a buffer object (e.g. NumPy arrays). ExtBuffer can also be initialized using a pointer to an existing memory region. This allows you, for example, to set up a NumPy array spanning a Cell SPU's memory-mapped local store, accessing LS like any other NumPy array. The ExtBuffer is included as part of the 'corepy.lib.extarray' module, and can be used like this:

import corepy.lib.extarray as extarray
import numpy

buf = extarray.extbuffer(4096, huge=True)
array = numpy.frombuffer(buf, dtype=numpy.int32)

I wrote some documentation here: http://corepy.org/wiki/index.php?title=Extended_Array If anyone has any questions, thoughts, ideas, bugs, etc, please let me know! Andrew ___ Corepy-devel mailing list corepy-de...@osl.iu.edu http://www.osl.iu.edu/mailman/listinfo.cgi/corepy-devel ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] How to remove fortran-like loops with numpy?
--- On Mon, 6/8/09, Robert Kern robert.k...@gmail.com wrote: Goldsmithd_l_goldsm...@yahoo.com wrote: The advantage of my implementation is that I didn't have to think too hard about it. -- Robert Kern Agreed. :-) DG ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate
--- On Mon, 6/8/09, Skipper Seabold jsseab...@gmail.com wrote: There was a recent discussion about numpy.financial in this thread http://mail.scipy.org/pipermail/numpy-discussion/2009-May/042709.html. Skipper Thanks, Skipper. Having now read that thread (but not the arguments, provided elsewhere, for the existence of numpy.financial in the first place), and considering that the only references mentioned there are also electronic ones (which, for the purpose of referencing sources in the function docs, I believe we're wanting to shun as much as possible), I formally move that numpy.financial (or at least that subset of it consisting of functions which are commonly subject to multiple definitions) be moved out of numpy. (Where _to_ exactly, I cannot say.) DG ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate
On 6/8/2009 11:18 PM David Goldsmith apparently wrote: I formally move that numpy.financial (or at least that subset of it consisting of functions which are commonly subject to multiple definitions) be moved out of numpy. My recollection is that Travis O. added this with the explicit intent of seducing users who might otherwise turn to spreadsheets for such functionality. I.e., it was part of an effort to extend the net of the NumPy community. I am not urging a case one way or another, although I am very sympathetic to that reasoning, whether or not I am correctly recalling the actual motivation. In that light, however, standard spreadsheet definitions would be the proper guide. E.g., the definitions used by Gnumeric. Cheers, Alan Isaac ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate
So would we regard a hard-copy of the users guide or reference manual for such a spreadsheet as sufficiently permanent to pass muster for use as a reference? DG --- On Mon, 6/8/09, Alan G Isaac ais...@american.edu wrote: From: Alan G Isaac ais...@american.edu Subject: Re: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate To: Discussion of Numerical Python numpy-discussion@scipy.org Date: Monday, June 8, 2009, 8:40 PM On 6/8/2009 11:18 PM David Goldsmith apparently wrote: I formally move that numpy.financial (or at least that subset of it consisting of functions which are commonly subject to multiple definitions) be moved out of numpy. My recollection is that Travis O. added this with the explicit intent of seducing users who might otherwise turn to spreadsheets for such functionality. I.e., it was part of an effort to extend the net of the NumPy community. I am not urging a case one way or another, although I am very sympathetic to that reasoning, whether or not I am correctly recalling the actual motivation. In that light, however, standard spreadsheet definitions would be the proper guide. E.g., the definitions used by Gnumeric. Cheers, Alan Isaac ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate
On Tue, Jun 9, 2009 at 12:18 AM, Robert Kernrobert.k...@gmail.com wrote: On Mon, Jun 8, 2009 at 22:54, David Goldsmithd_l_goldsm...@yahoo.com wrote: So would we regard a hard-copy of the users guide or reference manual for such a spreadsheet as sufficiently permanent to pass muster for use as a reference? The OpenFormula standard is probably better: http://www.oasis-open.org/committees/documents.php?wg_abbrev=office-formula This is a nice reference. There are notes for which packages agree/disagree, proprietary and open source, and values for tests. Skipper ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Definitions of pv, fv, nper, pmt, and rate
--- On Mon, 6/8/09, Skipper Seabold jsseab...@gmail.com wrote: I forgot the last payment (which doesn't earn any interest), so one more 100. So in fact they're not in agreement? pretty soon. I don't have a more permanent reference for fv offhand, but it should be in any corporate finance text etc. Most of these type of formulas use basic results of geometric series to simplify. Let me be more specific about the difference between what we have and what I'm finding in print. Essentially, it boils down to this: in every source I've found, two different present/future values are discussed, that for a single amount, and that for a constant (i.e., not even the first payment is allowed to be different) periodic payment. I have not been able to find a single printed reference that gives a formula for (or even discusses, for that matter) the combination of these two, which is clearly what we have implemented (and which is, just as clearly, actually seen in practice). Now, my lazy side simply hopes that my stridency will finally cause someone to pipe up and say look, dummy, it's in Schmoe, Joe, 2005. Advanced Financial Practice. Financial Press, NY NY. There's your reference; find it and look it up if you don't trust me and then I'll feel like we've at least covered our communal rear-end. 
But my more conscientious side worries that, if I've had so much trouble finding our more advanced definition (and I have tried, believe me), then I'm concerned that what your typical student (for example) is most likely to encounter is one of those simpler definitions, and thus get confused (at best) if they look at our help doc and find quite a different (at least superficially) definition (or worse, don't look at the help doc, and either can't get the function to work because the required number of inputs doesn't match what they're expecting from their text, or somehow manage to get it to work, but get an answer very different from that given in other sources, e.g., the answers in the back of their text.) One obvious answer to this dilemma is to explain this discrepancy in the help doc, but then we have to explain - clearly and lucidly, mind you - how one uses our functions for the two simpler cases, how/why the formula we use is the combination of the other two, etc. (it's rather hard to anticipate, for me at least, all the possible confusions this discrepancy might create) and in any event, somehow I don't really think something so necessarily elaborate is appropriate in this case. So, again, given that fv and pv (and by extension, nper, pmt, and rate) have multiple definitions floating around out there, I sincerely think we should punt (my apologies to those unfamiliar w/ the American football metaphor), i.e., rid ourselves of this nightmare, esp. in light of what I feel are compelling, independent arguments against the inclusion of these functions in this library in the first place. Sorry for my stridency, and thank you for your time and patience. DG Skipper ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Interactive Shell/Editor/Workspace(variables)View/Plots
On Tue, Jun 09, 2009 at 01:10:24PM +1200, Jochen Schroeder wrote: On 09/06/09 00:16, Gael Varoquaux wrote: On Mon, Jun 08, 2009 at 05:14:27PM -0500, Gökhan SEVER wrote: IPython's edit command works in a similar fashion, too. edit test.py The cool thing is that you can select text in the editor and execute in EPDLab. On the other hand, I know that IPython has hooks to grow this in the code base, and I would like this to grow also directly in IPython. Hell, I use vim. How cool would it be to select (using visual mode) snippets in vim, and execute them in a running Ipython session. I think there's a vim script for executing the marked code in python. If IPython has already hooks for executing code in an existing session, it might be possible to adapt this script. I do think it is, and that's just what I was suggesting. Now, I don't have time for that, but if someone feels like... :) Gaël ___ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion