On Tue, Jan 31, 2012 at 20:44, Jacob VanderPlas <[email protected]> wrote:
> Hello,
> I've been working on applying Gaussian Processes to noisy input data.
> The scikit-learn docs are not especially helpful on this topic, but
> after reading through some of the references and scanning the code, I
> found that the keyword 'nugget' in the initializer of GaussianProcess
> does essentially what I want to do: add a diagonal term to the internal
> covariance matrix.
>
> The name 'nugget' does not immediately suggest this to me - does anyone
> know the origin of this naming convention? Would it make more sense to
> rename "nugget" to "training_variance" or something similar? At the
> very least, I plan to add some documentation and a small example showing
> how to perform GPML on noisy data.

It comes from the geostatistical community, where Gaussian processes
show up under the name "kriging". The GP code in sklearn ultimately
derives from a kriging code and follows much of its terminology.
Kriging usually reformulates the covariance kernel as a variogram:

http://en.wikipedia.org/wiki/Variogram

The variogram as a function of radius usually looks like a smoothish
curve that starts at the "nugget" value near 0 (representing the
uncertainty of each individual measurement, or the uncertainty of a
point infinitesimally close to point X given the value at point X) and
rises asymptotically toward the "sill" value (the prior variance over
the whole domain, or the uncertainty of a point infinitely far away
from point X given the value at point X).

The variogram can be estimated from the data, and one can use empirical
variograms more or less directly. It's a bit trickier to back those out
to covariance kernels.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
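
To make the nugget/sill picture concrete, here is a rough NumPy sketch. It
is not sklearn's implementation: the function names, the choice of a
squared-exponential shape, and the `length_scale` parameter are all
assumptions made for illustration. It uses the standard relation
C(h) = sill - gamma(h) for a stationary process and shows that, for
distinct points, a nugget in the variogram amounts to a diagonal term added
to an otherwise smooth covariance matrix.

```python
import numpy as np


def variogram(h, nugget, sill, length_scale):
    """Hypothetical variogram model: squared-exponential shape plus a nugget."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + (sill - nugget) * (1.0 - np.exp(-(h / length_scale) ** 2))
    # gamma(0) is 0 by definition; the nugget is the limit as h -> 0 from above.
    return np.where(h == 0.0, 0.0, gamma)


def covariance_matrix(x, nugget, sill, length_scale):
    """Covariance between all pairs of 1-D points, via C(h) = sill - gamma(h)."""
    x = np.asarray(x, dtype=float)
    h = np.abs(x[:, None] - x[None, :])  # pairwise separations
    return sill - variogram(h, nugget, sill, length_scale)


# Illustrative values only (not taken from any data set).
x = np.array([0.0, 0.5, 2.0])
nugget, sill, length_scale = 0.1, 1.0, 1.0

K = covariance_matrix(x, nugget, sill, length_scale)

# The same matrix written as "smooth kernel + nugget on the diagonal".
h = np.abs(x[:, None] - x[None, :])
K_smooth = (sill - nugget) * np.exp(-(h / length_scale) ** 2)
K_diag = K_smooth + nugget * np.eye(len(x))
```

Under these assumptions `K` and `K_diag` agree, which is the sense in which
the nugget acts as a per-measurement variance on the diagonal. Note that
this only holds when the sample points are distinct; repeated points would
need separate handling.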
