> Date: Sun, 28 Mar 2010 00:24:01 +0000
> From: Andrea Gavana <[email protected]>
> Subject: [Numpy-discussion] Interpolation question
> To: Discussion of Numerical Python <[email protected]>
> Message-ID:
> <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> Hi All,
>
> I have an interpolation problem and I am having some difficulties
> in tackling it. I hope I can explain myself clearly enough.
>
> Basically, I have a whole bunch of 3D fluid flow simulations (close to
> 1000), and they are a result of different combinations of parameters.
> I was planning to use the Radial Basis Functions in scipy, but for the
> moment let's assume, to simplify things, that I am dealing only with
> one parameter (x). In 1000 simulations, this parameter x has 1000
> values, obviously. The problem is, the outcome of every single
> simulation is a vector of oil production over time (let's say 40
> values per simulation, one per year), and I would like to be able to
> interpolate my x parameter (1000 values) against all the simulations
> (1000x40) and get an approximating function that, given another x
> parameter (of size 1x1) will give me back an interpolated production
> profile (of size 1x40).

Andrea, may I suggest a different approach from RBFs?

Realize that the 40 values in each row of y are not independent of each
other (they will be correlated). First perform a principal component
analysis on this 1000 x 40 matrix and reduce it down to a 1000 x A
matrix, called your scores matrix, where A is the number of independent
components. A is selected so that it adequately summarizes Y without
over-fitting, and you will find A << 40, maybe 2 or 3. There are tools,
such as cross-validation, that do this well enough.

Then you can relate your single column of X to these independent
columns of the scores matrix using a tool such as least squares: one
least squares model per column in the scores matrix. This works because
each column in the scores matrix is independent of the others (it
contains totally orthogonal information). But I would be surprised if
this works well enough, unless A = 1.

But it sounds like you don't just have a single column in your
X-variables (you hinted that the single column was just for
simplification). In that case, I would build a projection to latent
structures (PLS) model: a single latent-variable model that
simultaneously models the X-matrix and the Y-matrix while maximizing
the covariance between these two matrices.

> Something along these lines:
>
> import numpy as np
> from scipy.interpolate import Rbf
>
> # x.shape = (1000, 1)
> # y.shape = (1000, 40)
>
> rbf = Rbf(x, y)
>
> # New result with xi.shape = (1, 1) --> fi.shape = (1, 40)
> fi = rbf(xi)
>
>
> Does anyone have a suggestion on how I could implement this? Sorry if
> it sounds confused... Please feel free to correct any wrong
> assumptions I have made, or to propose other approaches if you think
> RBFs are not suitable for this kind of problems.
>
> Thank you in advance for your suggestions.
>
> Andrea.
>
> "Imagination Is The Only Weapon In The War Against Reality."
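
For what it is worth, the PCA-then-least-squares idea above could be
sketched roughly as follows, using only NumPy. This is a minimal sketch
under stated assumptions, not a definitive implementation: the function
names fit_pca_regression and predict_profile are hypothetical, the
number of components A is passed in by hand rather than chosen by
cross-validation, the regression on x is plain linear with an
intercept, and the demo data is synthetic.

```python
import numpy as np


def fit_pca_regression(x, Y, n_components):
    """PCA on Y, then one least-squares model per score column.

    Hypothetical helper: x is (n,) or (n, p), Y is (n, k), e.g. (1000, 40).
    """
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(Y), -1)    # (n, p)
    y_mean = Y.mean(axis=0)
    Yc = Y - y_mean                                       # mean-centre Y

    # PCA via SVD: rows of Vt are the orthonormal loading vectors.
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    loadings = Vt[:n_components]                          # (A, k)
    scores = Yc @ loadings.T                              # (n, A) scores matrix

    # Least squares with an intercept. With a 2-D right-hand side,
    # lstsq solves each score column independently.
    X = np.column_stack([np.ones(len(x)), x])             # (n, p + 1)
    coef, *_ = np.linalg.lstsq(X, scores, rcond=None)     # (p + 1, A)
    return {"y_mean": y_mean, "loadings": loadings, "coef": coef}


def predict_profile(model, xi):
    """Predict the full profile(s) for new parameter value(s) xi."""
    p = model["coef"].shape[0] - 1
    xi = np.asarray(xi, dtype=float).reshape(-1, p)       # (m, p)
    X = np.column_stack([np.ones(len(xi)), xi])
    scores = X @ model["coef"]                            # predicted scores
    return scores @ model["loadings"] + model["y_mean"]   # back to (m, k)


# Synthetic demo data (an assumption, not Andrea's simulations):
# 200 "simulations", 40 "years", a decline curve scaled by x.
x_demo = np.linspace(0.0, 1.0, 200)
years = np.arange(40)
Y_demo = np.outer(x_demo, np.exp(-years / 10.0)) + 5.0

model = fit_pca_regression(x_demo, Y_demo, n_components=1)
fi = predict_profile(model, 0.5)                          # shape (1, 40)
```

The synthetic Y here has exactly one independent component and is
linear in x, so A = 1 reproduces it; real production profiles will
generally need a larger A and possibly a non-linear model for the
scores.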
