Re: [Numpy-discussion] Py3k: making a py3k compat header available in installed numpy for scipy
2010/3/30 David Cournapeau da...@silveregg.co.jp Pauli Virtanen wrote: [clip] At least, I don't see what I would like to change there. The only thing I wouldn't perhaps like to have in the long run are the PyString and possibly PyInt redefinition macros. I would also prefer a new name, instead of macro redefinition, but I can do it. For strings, the new names are PyBytes, PyUnicode and PyUString (unicode on Py3, bytes on Py2). One of these should be used instead of PyString in all places, but redefining the macro reduced the work in making a quick'n'dirty port of numpy. For integers, perhaps a separate PyInt on Py2, PyLong on Py3 type should be defined. Not sure if this would reduce work in practice. The header would not be part of the public API proper anyway (I will put it somewhere else than numpy/include), it should never be pulled implicitly in when including one of the .h in numpy/include. Agreed. Pauli ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Making a 2to3 distutils command ?
2010/3/30 David Cournapeau da...@silveregg.co.jp: Currently, when building numpy with python 3, the 2to3 conversion happens before calling any distutils command. Was there a reason for doing it as it is done now ? This allowed 2to3 to also port the various setup*.py files and numpy.distutils, and implementing it this way required the minimum amount of work and understanding of distutils -- you need to force it to proceed with the build using the set of output files from 2to3. I would like to make a proper numpy.distutils command for it, so that it can be more finely controlled (in particular, using the -j option). It would also avoid duplication in scipy. Are you sure you want to mix distutils in this? Wouldn't it only obscure how things work? If the aim is in making the 2to3 processing reusable, I'd rather simply move tools/py3tool.py under numpy.distutils (+ perhaps do some cleanups), and otherwise keep it completely separate from distutils. It could be nice to have the 2to3 conversion parallelizable, but there are probably simple ways to do it without mixing distutils in. But if you think this is really worth doing, go ahead. Pauli ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
David Warde-Farley wrote: Hi, In my setup.py, I have from numpy.distutils.misc_util import Configuration fflags= '-fdefault-real-8 -ffixed-form' config = Configuration( 'foo', parent_package=None, top_path=None, f2py_options='--f77flags=\'%s\' --f90flags=\'%s\'' % (fflags, fflags) ) However I am still getting stuff returned in 'real' variables as dtype=float32. Am I doing something wrong? Unless f2py is (too) smart, it probably just pass along --f77flags and --f90flags to the Fortran compiler, but don't use them when creating the wrapping C Python extension. So, Fortran uses real(8) and the type in C is float -- and what NumPy ends up seeing is float32. Unless you're dealing with arrays, you are likely blowing your stack here... I wouldn't think there's a way around this except fixing the original source. f2py is very much based on assumptions about type sizes which you then violate when passing -fdefault-real-8. Dag Sverre ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
Dag Sverre Seljebotn wrote: David Warde-Farley wrote: Hi, In my setup.py, I have from numpy.distutils.misc_util import Configuration fflags= '-fdefault-real-8 -ffixed-form' config = Configuration( 'foo', parent_package=None, top_path=None, f2py_options='--f77flags=\'%s\' --f90flags=\'%s\'' % (fflags, fflags) ) However I am still getting stuff returned in 'real' variables as dtype=float32. Am I doing something wrong? Unless f2py is (too) smart, it probably just pass along --f77flags and --f90flags to the Fortran compiler, but don't use them when creating the wrapping C Python extension. So, Fortran uses real(8) and the type in C is float -- and what NumPy ends up seeing is float32. Unless you're dealing with arrays, you are likely blowing your stack here... I wouldn't think there's a way around this except fixing the original source. f2py is very much based on assumptions about type sizes which you then violate when passing -fdefault-real-8. Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Dag Sverre ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Making a 2to3 distutils command ?
On Tue, Mar 30, 2010 at 1:09 AM, Pauli Virtanen p...@iki.fi wrote: 2010/3/30 David Cournapeau da...@silveregg.co.jp: Currently, when building numpy with python 3, the 2to3 conversion happens before calling any distutils command. Was there a reason for doing it as it is done now ? This allowed 2to3 to also port the various setup*.py files and numpy.distutils, and implementing it this way required the minimum amount of work and understanding of distutils -- you need to force it to proceed with the build using the set of output files from 2to3. I would like to make a proper numpy.distutils command for it, so that it can be more finely controlled (in particular, using the -j option). It would also avoid duplication in scipy. Are you sure you want to mix distutils in this? Wouldn't it only obscure how things work? If the aim is in making the 2to3 processing reusable, I'd rather simply move tools/py3tool.py under numpy.distutils (+ perhaps do some cleanups), and otherwise keep it completely separate from distutils. It could be nice to have the 2to3 conversion parallelizable, but there are probably simple ways to do it without mixing distutils in. Out of curiosity, is there something wrong with the support for 2to3 that already exists within distutils? (Other than it just being distutils) http://bruynooghe.blogspot.com/2010/03/using-lib2to3-in-setuppy.html Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Making a 2to3 distutils command ?
Tue, 2010-03-30 at 07:18 -0600, Ryan May wrote: Out of curiosity, is there something wrong with the support for 2to3 that already exists within distutils? (Other than it just being distutils) http://bruynooghe.blogspot.com/2010/03/using-lib2to3-in-setuppy.html That AFAIK converts only those Python files that will be installed, and I don't know how to tell it to disable some conversions on a per-file basis. (Numpy also contains some files necessary for building that will not be installed...) But OK, to be honest, I didn't look closely at whether one could make it work using the bundled 2to3 command. Things might be cleaner if it can be made to work. Pauli
[Numpy-discussion] SciPy 2010: Vote on Tutorials Sprints
Hello Amenity, Spring is upon us and arrangements for SciPy 2010 are in full swing. We're already nearing some important deadlines for conference participants: April 11th is the deadline for submitting an abstract for a paper, and April 15th is the deadline for submitting a tutorial proposal. Help choose tutorials for SciPy 2010... We set up a UserVoice page to brainstorm tutorial topics last week and we already have some great ideas. The top ones at the moment are: Effective multi-core programming with Cython and Python; Building your own app with Mayavi; High performance computing with Python. Propose your own or vote on the existing suggestions here. ...Or instruct a tutorial and cover your conference costs. Did you know that we're awarding generous stipends to tutorial instructors this year? So if you believe you could lead a tutorial, by all means submit your proposal soon! They're due April 15th. Call for Papers Continues Submitting a paper to present at SciPy 2010 is easy, so remember to prepare one and have your friends and colleagues follow suit. Send us your abstract before April 11th and let us know whether you'd like to speak at the main conference or one of the specialized tracks. Details here. Have you registered? Booking your tickets early should save you money, not to mention the early registration prices you will qualify for if you register before May 10th. Best, The SciPy 2010 Team @SciPy2010 on Twitter You are receiving this email because you have registered for the SciPy 2010 conference in Austin, TX. Enthought, Inc. 515 Congress Ave. Austin, TX 78701 Copyright (C) 2010 Enthought, Inc. All rights reserved.
[Numpy-discussion] indexing question
This one bit me again, and I am trying to understand it better so I can anticipate when it will happen. What I want to do is get rid of singleton dimensions, and index into the last dimension with an array. In [1]: import numpy as np In [2]: x=np.zeros((10,1,1,1,14,1024)) In [3]: x[:,0,0,0,:,[1,2,3]].shape Out[3]: (3, 10, 14) Whoa! Trimming my array to a desired number ends up moving the last dimension to the first! In [4]: np.__version__ Out[4]: '1.3.0' ... In [7]: x[:,:,:,:,:,[1,2,3]].shape Out[7]: (10, 1, 1, 1, 14, 3) This looks right... In [8]: x[...,[1,2,3]].shape Out[8]: (10, 1, 1, 1, 14, 3) and this... In [9]: x[...,[1,2,3]][:,0,0,0].shape Out[9]: (10, 14, 3) ... In [11]: x[:,0,0,0][...,[1,2,3]].shape Out[11]: (10, 14, 3) Either of the last 2 attempts above results in what I want, so I can do that... I just need some help deciphering when and why the first thing happens. -- View this message in context: http://old.nabble.com/indexing-question-tp28083162p28083162.html Sent from the Numpy-discussion mailing list archive at Nabble.com. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] indexing question
On 3/30/2010 10:13 AM, Tom K. wrote: What I want to do is get rid of singleton dimensions, and index into the last dimension with an array. x=np.zeros((10,1,1,1,14,1024)) np.squeeze(x).shape (10, 14, 1024) hth, Alan Isaac
Re: [Numpy-discussion] indexing question
On Tue, Mar 30, 2010 at 8:13 AM, Tom K. t...@kraussfamily.org wrote: This one bit me again, and I am trying to understand it better so I can anticipate when it will happen. What I want to do is get rid of singleton dimensions, and index into the last dimension with an array. In [1]: import numpy as np In [2]: x=np.zeros((10,1,1,1,14,1024)) In [3]: x[:,0,0,0,:,[1,2,3]].shape Out[3]: (3, 10, 14) Hmm... That doesn't look right. snip Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] indexing question
On Tue, Mar 30, 2010 at 09:46, Charles R Harris charlesr.har...@gmail.com wrote: On Tue, Mar 30, 2010 at 8:13 AM, Tom K. t...@kraussfamily.org wrote: This one bit me again, and I am trying to understand it better so I can anticipate when it will happen. What I want to do is get rid of singleton dimensions, and index into the last dimension with an array. In [1]: import numpy as np In [2]: x=np.zeros((10,1,1,1,14,1024)) In [3]: x[:,0,0,0,:,[1,2,3]].shape Out[3]: (3, 10, 14) Hmm... That doesn't look right. It's a known feature. Slicing and list indexing are separate subsystems. The list indexing takes priority so the list-indexed axes end up first in the result. The sliced axes follow them. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
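Robert's rule can be checked directly: when the advanced (list/integer) indices are separated from each other by a slice, the broadcast advanced axes are moved to the front of the result, while applying the fancy index in a separate step keeps it in place. A small sketch of Tom's case:

```python
import numpy as np

x = np.zeros((10, 1, 1, 1, 14, 1024))

# The integer indices and the list [1, 2, 3] are all "advanced" indices,
# and they are separated by the slice on axis 4, so the broadcast
# advanced result (length 3) lands in front of the sliced axes.
mixed = x[:, 0, 0, 0, :, [1, 2, 3]]
print(mixed.shape)      # (3, 10, 14)

# Doing the list indexing as a separate step leaves the axis where it was.
separate = x[:, 0, 0, 0][..., [1, 2, 3]]
print(separate.shape)   # (10, 14, 3)
```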
Re: [Numpy-discussion] indexing question
On Tue, Mar 30, 2010 at 10:13 AM, Tom K. t...@kraussfamily.org wrote: This one bit me again, and I am trying to understand it better so I can anticipate when it will happen. What I want to do is get rid of singleton dimensions, and index into the last dimension with an array. In [1]: import numpy as np In [2]: x=np.zeros((10,1,1,1,14,1024)) In [3]: x[:,0,0,0,:,[1,2,3]].shape Out[3]: (3, 10, 14) Whoa! Trimming my array to a desired number ends up moving the last dimension to the first! In [4]: np.__version__ Out[4]: '1.3.0' ... In [7]: x[:,:,:,:,:,[1,2,3]].shape Out[7]: (10, 1, 1, 1, 14, 3) This looks right... In [8]: x[...,[1,2,3]].shape Out[8]: (10, 1, 1, 1, 14, 3) and this... In [9]: x[...,[1,2,3]][:,0,0,0].shape Out[9]: (10, 14, 3) ... In [11]: x[:,0,0,0][...,[1,2,3]].shape Out[11]: (10, 14, 3) Either of the last 2 attempts above results in what I want, so I can do that... I just need some help deciphering when and why the first thing happens. An explanation about the surprising behavior when slicing and fancy indexing is mixed with more than 2 dimensions is in this thread http://www.mail-archive.com/numpy-discussion@scipy.org/msg16299.html More examples show up every once in a while on the mailing list. Josef
[Numpy-discussion] Set values of a matrix within a specified range to zero
Hello all, I'm relatively new to numpy. I'm working with text images as 512x512 arrays. I would like to set elements of the array whose values fall within a specified range to zero (e.g. 23 < x < 45). Any advice is much appreciated. Sean
Re: [Numpy-discussion] Set values of a matrix within a specified range to zero
On 3/30/2010 12:56 PM, Sean Mulcahy wrote: 512x512 arrays. I would like to set elements of the array whose values fall within a specified range to zero (e.g. 23 < x < 45). x[(23<x)*(x<45)]=0 hth, Alan Isaac
Re: [Numpy-discussion] Set values of a matrix within a specified range to zero
On Tue, Mar 30, 2010 at 11:12 AM, Alan G Isaac ais...@american.edu wrote: On 3/30/2010 12:56 PM, Sean Mulcahy wrote: 512x512 arrays. I would like to set elements of the array whose values fall within a specified range to zero (e.g. 23 < x < 45). x[(23<x)*(x<45)]=0 Or a version that seems a bit more obvious (doing a multiply between boolean arrays to get an AND operator seems a tad odd): x[(23 < x) & (x < 45)] = 0 Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma
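For what it's worth, the `*` and `&` spellings build the same boolean mask; a minimal sketch (the sample values are made up):

```python
import numpy as np

x = np.array([[10.0, 30.0], [44.0, 50.0]])

# Parentheses are required: & binds more tightly than < in Python.
mask = (23 < x) & (x < 45)

y = x.copy()
y[mask] = 0                    # Ryan's version: boolean AND via &

z = x.copy()
z[(23 < z) * (z < 45)] = 0     # Alan's version: AND via elementwise multiply

print(y)   # [[10.  0.] [ 0. 50.]] -- 30 and 44 fall in the open interval
```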
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
Hey Dag, On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Actually I've gotten it to work this way, with real(8) in the wrappers. BUT... for some reason it requires me to set the environment variable G77='gfortran -fdefault-real-8'. Simply putting it in f2py_options='--f77flags=\'-fdefault-real-8\' --f90flags=\'-fdefault-real-8\'' doesn't seem to do the trick. I don't really understand why. David
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
On 30-Mar-10, at 2:14 PM, David Warde-Farley wrote: Hey Dag, On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Actually I've gotten it to work this way, with real(8) in the wrappers. BUT... for some reason it requires me to set the environment variable G77='gfortran -fdefault-real-8'. Simply putting it in f2py_options='--f77flags=\'-fdefault-real-8\' --f90flags=\'-fdefault-real-8\'' doesn't seem to do the trick. I don't really understand why. Sorry, that should say F77='gfortran -fdefault-real-8'. Setting G77 obviously doesn't do anything. ;) Without that environment variable, I think it is blowing the stack; the program just hangs. David
[Numpy-discussion] Replacing NANs
Hi, In an array I want to replace all NANs with some number, say 100. I found a method nan_to_num but it only replaces with zero. Any solution for this? Thanks Vishal
Re: [Numpy-discussion] Replacing NANs
On 30 March 2010 14:59, Vishal Rana ranavis...@gmail.com wrote: Hi, In an array I want to replace all NANs with some number say 100, I found a method nan_to_num but it only replaces with zero. Any solution for this? ar[np.isnan(ar)] = my_num where ar is your array and my_num is the number you want to replace nans with. Angus. Thanks Vishal -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh
Re: [Numpy-discussion] Replacing NANs
In an array I want to replace all NANs with some number say 100, I found a method nan_to_num but it only replaces with zero. Any solution for this? Indexing with a mask is one approach here: a[numpy.isnan(a)] = 100 Also cf. numpy.isfinite in case you want the same for infs. Zach On Mar 30, 2010, at 2:59 PM, Vishal Rana wrote: Hi, In an array I want to replace all NANs with some number say 100, I found a method nan_to_num but it only replaces with zero. Any solution for this? Thanks Vishal
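Both suggestions in one runnable sketch (the array contents are made up); np.where is a non-mutating alternative:

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0, np.inf])

# In-place replacement of NaNs via a boolean mask.
b = a.copy()
b[np.isnan(b)] = 100

# Non-mutating version: np.where builds a new array.
c = np.where(np.isnan(a), 100, a)

# isfinite catches infs as well as NaNs.
d = a.copy()
d[~np.isfinite(d)] = 100
```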
Re: [Numpy-discussion] Replacing NANs
That was quick! Thanks Angus and Zachary On Tue, Mar 30, 2010 at 11:59 AM, Vishal Rana ranavis...@gmail.com wrote: Hi, In an array I want to replace all NANs with some number say 100, I found a method nan_to_num but it only replaces with zero. Any solution for this? Thanks Vishal
Re: [Numpy-discussion] Interpolation question
Andrea Gavana andrea.gavana at gmail.com writes: Hi Kevin, On 29 March 2010 01:38, Kevin Dunn wrote: Message: 5 Date: Sun, 28 Mar 2010 00:24:01 + From: Andrea Gavana andrea.gavana at gmail.com Subject: [Numpy-discussion] Interpolation question To: Discussion of Numerical Python numpy-discussion at scipy.org Hi All, I have an interpolation problem and I am having some difficulties in tackling it. I hope I can explain myself clearly enough. Basically, I have a whole bunch of 3D fluid flow simulations (close to 1000), and they are a result of different combinations of parameters. I was planning to use the Radial Basis Functions in scipy, but for the moment let's assume, to simplify things, that I am dealing only with one parameter (x). In 1000 simulations, this parameter x has 1000 values, obviously. The problem is, the outcome of every single simulation is a vector of oil production over time (let's say 40 values per simulation, one per year), and I would like to be able to interpolate my x parameter (1000 values) against all the simulations (1000x40) and get an approximating function that, given another x parameter (of size 1x1) will give me back an interpolated production profile (of size 1x40). [I posted the following earlier but forgot to change the subject - it appears as a new thread called NumPy-Discussion Digest, Vol 42, Issue 85 - please ignore that thread] Andrea, may I suggest a different approach to RBF's. Realize that your vector of 40 values for each row in y are not independent of each other (they will be correlated). First build a principal component analysis (PCA) model on this 1000 x 40 matrix and reduce it down to a 1000 x A matrix, called your scores matrix, where A is the number of independent components. A is selected so that it adequately summarizes Y without over-fitting, and you will find A < 40, maybe A = 2 or 3. 
There are tools, such as cross-validation, that will help select a reasonable value of A. Then you can relate your single column of X to these independent columns in A using a tool such as least squares: one least squares model per column in the scores matrix. This works because each column in the score vector is independent (contains totally orthogonal information) to the others. But I would be surprised if this works well enough, unless A = 1. But it sounds like you don't just have a single column in your X-variables (you hinted that the single column was just for simplification). In that case, I would build a projection to latent structures (PLS) model that builds a single latent-variable model that simultaneously models the X-matrix, the Y-matrix as well as providing the maximal covariance between these two matrices. If you need some references and an outline of code, then I can readily provide these. This is a standard problem with data from spectroscopic instruments and with batch processes. They produce hundreds, sometimes 1000's of samples per row. PCA and PLS are very effective at summarizing these down to a much smaller number of independent columns, very often just a handful, and relating them (i.e. building a predictive model) to other data matrices. I also just saw the suggestions of others to center the data by subtracting the mean from each column in Y and scaling (by dividing through by the standard deviation). This is a standard data preprocessing step, called autoscaling, and makes sense for any data analysis, as you already discovered. I have got some success by using time-based RBF interpolations, but I am always open to other possible implementations (as the one I am using can easily fail for strange combinations of input parameters). Unfortunately, my understanding of your explanation is very very limited: I am not an expert at all, so it's a bit hard for me to translate the mathematical technical stuff into something I can understand. 
If you have an example code (even a very trivial one) for me to study so that I can understand what the code is actually doing, I would be more than grateful for your help PCA can be calculated in several different ways. The code below works well enough in many cases, but it can't handle missing data. If you need that capability, let me know. You can read more about PCA here (shameless plug for the course I teach on this material): http://stats4.eng.mcmaster.ca/wiki/Latent_variable_methods_and_applications And here is some Python code for a fake data set and an actual data set. Hope that gets you started, Kevin. import numpy as np use_fake_data = False if use_fake_data: N = 1000 K = 40 A = 3 # extract 3 principal components # Replace this garbage data with your actual, raw data X_raw =
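The archived message cuts Kevin's script off right at the data definition, so it cannot be restored verbatim. As a stand-in, here is a minimal PCA-via-SVD sketch along the lines he describes (autoscale, then extract A score columns); the fake data, the seed, and all variable names are mine, not Kevin's, and this version likewise cannot handle missing data:

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, A = 1000, 40, 3     # samples, variables, components to keep

# Fake low-rank data with a little noise, standing in for the 1000 x 40
# matrix of production profiles (replace with your actual raw data).
X_raw = rng.standard_normal((N, A)) @ rng.standard_normal((A, K))
X_raw += 0.01 * rng.standard_normal((N, K))

# Autoscale: center each column and divide by its standard deviation.
X = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

# PCA via the SVD: scores T (N x A) and loadings P (K x A).
U, s, Vt = np.linalg.svd(X, full_matrices=False)
T = U[:, :A] * s[:A]
P = Vt[:A].T

# Fraction of total variance captured by the first A components;
# choosing A by cross-validation would replace this eyeball check.
explained = (s[:A] ** 2).sum() / (s ** 2).sum()
```

Each column of T is orthogonal to the others, which is what makes the per-column least squares step in the follow-up message legitimate.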
Re: [Numpy-discussion] numpy.distutils/f2py: forcing 8-bit reals
On 30-Mar-10, at 5:02 AM, Dag Sverre Seljebotn wrote: Well, you can pass -fdefault-real-8 and then write .pyf headers where real(8) is always given explicitly. Okay, the answer (without setting the F77 environment variable) is basically to expect real-8's in the .pyf file and compile the whole package with python setup.py config_fc --fcompiler=gnu95 --f77flags='-fdefault-real-8' --f90flags='-fdefault-real-8' build Still not sure if there's a sane way to make this the default in my setup.py (I found some old mailing list posts where Pearu suggests modifying sys.argv but I think this is widely considered a bad idea). The right way might involve instantiating a GnuF95Compiler object and messing with its fields or something like that. Anyway, this works for now. Thanks, David
Re: [Numpy-discussion] Interpolation question
2010/3/30 Andrea Gavana andrea.gav...@gmail.com: On 29 March 2010 23:44, Friedrich Romstedt wrote: When you have nice results using 40 Rbfs for each time instant, this procedure means that the values for one time instant will not be influenced by adjacent-year data. I.e., you would probably get the same result using a norm extraordinarily blowing up the time coordinate. To make it clear in code, when the time is your first coordinate, and you have three other coordinates, the *norm* would be: def norm(x1, x2): return numpy.sqrt((((x1 - x2) * [1e3, 1, 1]) ** 2).sum()) In this case, the epsilon should be fixed, to avoid the influence of the changing distances on the epsilon determination inside of Rbf, which would spoil the whole thing. Of course, here it is two and not three other variables. I have an idea how to tune your model: Take, say, the half or three thirds of your simulation data as interpolation database, and try to reproduce the remaining part. I have some ideas how to tune using this in practice. Here, of course, it is three quarters and not three thirds :-) This is a very good idea indeed: I am actually running out of test cases (it takes a while to run a simulation, and I need to do it every time I try a new combination of parameters to check if the interpolation is good enough or rubbish). I'll give it a go tomorrow at work and I'll report back (even if I get very bad results :-D ). I refined the idea a bit. Select one simulation, and use the complete rest as the interpolation base. Then repeat this for each simulation. Calculate some joint value for all the results, the simplest would maybe be to calculate: def joint_ln_density(simulation_results, interpolation_results): return -((interpolation_results - simulation_results) ** 2) / (simulation_results ** 2) In fact, this calculates the logarithm of the Gaussians centered at *simulation_results* and taken at the observations *interpolation_results*. It is the logarithm of the product of these Gaussians. 
The standard deviation of the Gaussians is assumed to be the value of the *simulation_results*, which means that I assume that low-valued outcomes are much more precise in absolute numbers than high-outcome values, but /relative/ to their nominal value they are all equally precise. (NB: A scaling of the stddevs wouldn't make a significant difference /for you/. The same goes for the neglected coefficients of the Gaussians.) I don't know which method you like the most. Robert's and Kevin's proposals are hard to compete with ... You could optimise (maximise) the joint_ln_density outcome as a function of *epsilon* and the different scalings. afaic, scipy comes with some optimisation algorithms included. I checked it: http://docs.scipy.org/doc/scipy-0.7.x/reference/optimize.html#general-purpose . Friedrich
Re: [Numpy-discussion] Interpolation question
Hi Friedrich & All, On 30 March 2010 21:48, Friedrich Romstedt wrote: 2010/3/30 Andrea Gavana andrea.gav...@gmail.com: On 29 March 2010 23:44, Friedrich Romstedt wrote: When you have nice results using 40 Rbfs for each time instant, this procedure means that the values for one time instant will not be influenced by adjacent-year data. I.e., you would probably get the same result using a norm extraordinarily blowing up the time coordinate. To make it clear in code, when the time is your first coordinate, and you have three other coordinates, the *norm* would be: def norm(x1, x2): return numpy.sqrt((((x1 - x2) * [1e3, 1, 1]) ** 2).sum()) In this case, the epsilon should be fixed, to avoid the influence of the changing distances on the epsilon determination inside of Rbf, which would spoil the whole thing. Of course, here it is two and not three other variables. I have an idea how to tune your model: Take, say, the half or three thirds of your simulation data as interpolation database, and try to reproduce the remaining part. I have some ideas how to tune using this in practice. Here, of course, it is three quarters and not three thirds :-) This is a very good idea indeed: I am actually running out of test cases (it takes a while to run a simulation, and I need to do it every time I try a new combination of parameters to check if the interpolation is good enough or rubbish). I'll give it a go tomorrow at work and I'll report back (even if I get very bad results :-D ). I refined the idea a bit. Select one simulation, and use the complete rest as the interpolation base. Then repeat this for each simulation. 
Calculate some joint value for all the results, the simplest would maybe be to calculate: def joint_ln_density(simulation_results, interpolation_results): return -((interpolation_results - simulation_results) ** 2) / (simulation_results ** 2) In fact, this calculates the logarithm of the Gaussians centered at *simulation_results* and taken at the observations *interpolation_results*. It is the logarithm of the product of these Gaussians. The standard deviation of the Gaussians is assumed to be the value of the *simulation_results*, which means that I assume that low-valued outcomes are much more precise in absolute numbers than high-outcome values, but /relative/ to their nominal value they are all equally precise. (NB: A scaling of the stddevs wouldn't make a significant difference /for you/. The same goes for the neglected coefficients of the Gaussians.) I don't know which method you like the most. Robert's and Kevin's proposals are hard to compete with ... You could optimise (maximise) the joint_ln_density outcome as a function of *epsilon* and the different scalings. afaic, scipy comes with some optimisation algorithms included. I checked it: http://docs.scipy.org/doc/scipy-0.7.x/reference/optimize.html#general-purpose I had planned to show some of the results based on the suggestion you gave me yesterday: I took two thirds ( :-D ) of the simulations database to use them as interpolation base and tried to reproduce the rest using the interpolation. Unfortunately it seems like my computer at work has blown up (maybe a BSOD, I was doing way too many heavy things at once) and I can't access it from home at the moment. I can't show the real field profiles, but at least I can show you how good or bad the interpolation performs (in terms of relative errors), and I was planning to post a matplotlib composite graph to do just that. 
I am still hopeful my PC will resurrect at some point :-D However, from the first 100 or so interpolated simulations, I could gather these findings: 1) Interpolations on *cumulative* productions of oil and gas are extremely good, with a maximum range of relative error of -3% / +2%: most of them (95% more or less) show errors < 1%, but for a few of them I get the aforementioned range of errors in the interpolations; 2) For interpolations on oil and gas *rates* (time dependent), I usually get a -5% / +3% error per timestep, which is very good for my purposes. I still need to check these values, but the first set of results was very promising; 3) Interpolations on gas injection (cumulative and rate) are a bit more shaky (15% error more or less), but this is essentially due to a particularly complex behaviour of the reservoir simulator when it needs to decide (based on user input) if the gas is going to be re-injected, sold, used as excess gas and a few more; I am not that worried about this issue for the moment. I'll be off for a week for Easter (I'll leave for Greece in a few hours), but I am really looking forward to trying the suggestion Kevin posted and to investigating Robert's idea: I know about kriging, but I initially thought it wouldn't do a good job in this case. I'll reconsider for sure. And I'll post a screenshot of the results as soon as my PC gets out of the Emergency Room. Thank you again! Andrea.
Re: [Numpy-discussion] Set values of a matrix within a specified range to zero
2010/3/30 Ryan May rma...@gmail.com: On Tue, Mar 30, 2010 at 11:12 AM, Alan G Isaac ais...@american.edu wrote: On 3/30/2010 12:56 PM, Sean Mulcahy wrote: 512x512 arrays. I would like to set elements of the array whose value falls within a specified range to zero (e.g. 23 < x < 45). x[(23 < x) * (x < 45)] = 0 Or a version that seems a bit more obvious (doing a multiply between boolean arrays to get an AND operator seems a tad odd): x[(23 < x) & (x < 45)] = 0 We recently found out that it executes faster using: x *= ((x <= 23) | (x >= 45)). Friedrich ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
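The three spellings discussed in this thread are equivalent: multiplying boolean arrays acts as an elementwise AND, and the in-place multiply keeps exactly the values outside the open interval. A small self-contained check (the sample values are my own):

```python
import numpy as np

x = np.array([10.0, 24.0, 44.9, 45.0, 60.0])

# Fancy indexing with an explicit boolean AND:
a = x.copy()
a[(23 < a) & (a < 45)] = 0

# Multiplying boolean arrays also acts as an elementwise AND:
b = x.copy()
b[(23 < b) * (b < 45)] = 0

# The in-place multiply keeps everything *outside* the open interval
# (the boolean mask is promoted to 0.0/1.0 factors):
c = x.copy()
c *= (c <= 23) | (c >= 45)

assert (a == b).all() and (b == c).all()
```

All three zero the strictly-inside elements (24.0 and 44.9 here) and leave the boundary value 45.0 untouched.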
Re: [Numpy-discussion] Fourier transform
On Mon, 29 Mar 2010 16:12:56 -0600, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Mar 29, 2010 at 3:00 PM, Pascal pascal...@parois.net wrote: Hi, Does anyone have an idea how the fft functions are implemented? Is it pure Python? Based on BLAS/LAPACK? Or is it using fftw? I successfully used numpy.fft in 3D. I would like to know if I can calculate a specific plane using numpy.fft. I have in 3D: r(x, y, z) = \sum_{h=0}^{N-1} \sum_{k=0}^{M-1} \sum_{l=0}^{O-1} f_{hkl} \exp(-2\pi i (hx/N + ky/M + lz/O)) So for the plane, z is no longer independent. I need to solve the system: ax + by + cz + d = 0 together with r(x, y, z) = \sum_{h=0}^{N-1} \sum_{k=0}^{M-1} \sum_{l=0}^{O-1} f_{hkl} \exp(-2\pi i (hx/N + ky/M + lz/O)) Do you think it's possible to use numpy.fft for this? I'm not clear on what you want to do here, but note that the term in the exponent is of the form <k, x>, i.e., the inner product of the vectors k and x. So if you rotate x by O so that the plane is defined by z = 0, then <k, Ox> = <O.T k, x>. That is, you can apply the transpose of the rotation to the result of the fft. In other words, z is no longer independent but depends on x and y. Apparently, nobody calculates the exact plane; instead they make a slice in the 3D grid and do some interpolation. However, your answer really helped me with something completely different :) Thanks, Pascal
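When the plane does not pass through grid points, the sum above can also be evaluated directly at arbitrary (x, y, z), i.e. as a non-uniform DFT. This is O(N*M*O) per point, so it is only practical for a modest number of points, but it avoids any interpolation. A sketch (the function name and loop structure are my own; at integer grid points the result coincides with numpy.fft.fftn):

```python
import numpy as np

def fourier_sum_at_points(f, pts):
    # Evaluate r(x,y,z) = sum_{h,k,l} f[h,k,l] *
    # exp(-2*pi*i*(h*x/N + k*y/M + l*z/O)) at arbitrary points.
    # pts is a sequence of (x, y, z) triples, which need not lie on the
    # integer grid, e.g. points sampled on the plane a*x + b*y + c*z + d = 0.
    N, M, O = f.shape
    h, k, l = np.meshgrid(np.arange(N), np.arange(M), np.arange(O),
                          indexing='ij')
    out = np.empty(len(pts), dtype=complex)
    for i, (x, y, z) in enumerate(pts):
        phase = np.exp(-2j * np.pi * (h * x / N + k * y / M + l * z / O))
        out[i] = (f * phase).sum()
    return out
```

As a sanity check, for integer (x, y, z) this reproduces entries of `np.fft.fftn(f)`, since fftn uses exactly the same sign convention in the exponent.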
Re: [Numpy-discussion] Set values of a matrix within a specified range to zero
On Tue, Mar 30, 2010 at 3:16 PM, Friedrich Romstedt friedrichromst...@gmail.com wrote: 2010/3/30 Ryan May rma...@gmail.com: On Tue, Mar 30, 2010 at 11:12 AM, Alan G Isaac ais...@american.edu wrote: On 3/30/2010 12:56 PM, Sean Mulcahy wrote: 512x512 arrays. I would like to set elements of the array whose value falls within a specified range to zero (e.g. 23 < x < 45). x[(23 < x) * (x < 45)] = 0 Or a version that seems a bit more obvious (doing a multiply between boolean arrays to get an AND operator seems a tad odd): x[(23 < x) & (x < 45)] = 0 We recently found out that it executes faster using: x *= ((x <= 23) | (x >= 45)). Interesting. In an ideal world, I'd love to see why exactly that is, because I don't think multiplication should be faster than a boolean op. If you need speed, then by all means go for it. But if you don't need speed, I'd use the & version since that will be more obvious to the person who ends up reading your code later and has to spend time decoding what that line does. Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma
Re: [Numpy-discussion] Interpolation question
2010/3/30 Andrea Gavana andrea.gav...@gmail.com: However, from the first 100 or so interpolated simulations, I could gather these findings: 1) Interpolations on *cumulative* productions of oil and gas are extremely good, with a maximum range of relative error of -3% / +2%: most of them (95% more or less) show errors < 1%, but for a few of them I get the aforementioned range of errors in the interpolations; 2) For interpolations on oil and gas *rates* (time dependent), I usually get a -5% / +3% error per timestep, which is very good for my purposes. I still need to check these values but the first set of results was very promising; 3) Interpolations on gas injection (cumulative and rate) are a bit more shaky (15% error more or less), but this is essentially due to a particularly complex behaviour of the reservoir simulator when it needs to decide (based on user input) if the gas is going to be re-injected, sold, used as excess gas and a few more; I am not that worried about this issue for the moment. Have a nice time in Greece, and what you write makes me laugh. :-) When you are back, you should maybe elaborate a bit on what gas injections, wells, re-injected gas and so on are; I don't know about it. Friedrich
Re: [Numpy-discussion] Set values of a matrix within a specified range to zero
On Tue, Mar 30, 2010 at 16:35, Ryan May rma...@gmail.com wrote: On Tue, Mar 30, 2010 at 3:16 PM, Friedrich Romstedt friedrichromst...@gmail.com wrote: x *= ((x <= 23) | (x >= 45)). Interesting. In an ideal world, I'd love to see why exactly that is, because I don't think multiplication should be faster than a boolean op. Branch prediction failures are really costly in modern CPUs. http://en.wikipedia.org/wiki/Branch_prediction -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
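The claim is easy to check on one's own machine with timeit. The in-place multiply is branchless per element, while the masked assignment writes only the selected elements; the actual numbers are machine- and NumPy-version-dependent, so this sketch just prints both timings rather than asserting a winner:

```python
import timeit

# Build a 512x512 array once per timing run, matching the thread's use case.
setup = "import numpy as np; x = np.random.uniform(0, 100, (512, 512))"

t_index = timeit.timeit("y = x.copy(); y[(23 < y) & (y < 45)] = 0",
                        setup=setup, number=100)
t_mult = timeit.timeit("y = x.copy(); y *= (y <= 23) | (y >= 45)",
                       setup=setup, number=100)
print("fancy indexing: %.4f s, in-place multiply: %.4f s" % (t_index, t_mult))
```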
Re: [Numpy-discussion] Set values of a matrix within a specified range to zero
2010/3/30 Ryan May rma...@gmail.com: On Tue, Mar 30, 2010 at 3:16 PM, Friedrich Romstedt friedrichromst...@gmail.com wrote: We recently found out that it executes faster using: x *= ((x <= 23) | (x >= 45)). Interesting. In an ideal world, I'd love to see why exactly that is, because I don't think multiplication should be faster than a boolean op. If you need speed, then by all means go for it. But if you don't need speed, I'd use the & version since that will be more obvious to the person who ends up reading your code later and has to spend time decoding what that line does. Hmm, I'm not familiar with numpy's internals, but I guess it is because numpy doesn't have to evaluate the indexing boolean array when the indexing array is applied to x via x[...] = 0? Robert's reply just came in while I was writing this. Friedrich
Re: [Numpy-discussion] Set values of a matrix within a specified range to zero
On Tue, Mar 30, 2010 at 3:40 PM, Robert Kern robert.k...@gmail.com wrote: On Tue, Mar 30, 2010 at 16:35, Ryan May rma...@gmail.com wrote: On Tue, Mar 30, 2010 at 3:16 PM, Friedrich Romstedt friedrichromst...@gmail.com wrote: x *= ((x <= 23) | (x >= 45)). Interesting. In an ideal world, I'd love to see why exactly that is, because I don't think multiplication should be faster than a boolean op. Branch prediction failures are really costly in modern CPUs. http://en.wikipedia.org/wiki/Branch_prediction That makes sense. I still maintain that for 95% of code, easy-to-understand code is more important than performance differences due to branch misprediction. (And more importantly, we don't want to be teaching new users to code like that from the beginning.) Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma