On Tue, Jan 31, 2012 at 20:44, Jacob VanderPlas
<[email protected]> wrote:
> Hello,
> I've been working on applying Gaussian Processes to noisy input data.
> The scikit-learn docs are not especially helpful on this topic, but
> after reading through some of the references and scanning the code, I
> found that the keyword 'nugget' in the initializer of GaussianProcess
> does essentially what I want to do: add a diagonal term to the internal
> covariance matrix.
>
> The name 'nugget' does not immediately suggest this to me - does anyone
> know the origin of this naming convention?  Would it make more sense to
> rename "nugget" to "training_variance" or something similar?  At the
> very least, I plan to add some documentation and a small example showing
> how to perform GPML on noisy data.

It comes from the geostatistical community where Gaussian processes
show up under the name "kriging". The GP code in sklearn ultimately
derives from a "kriging" code and follows much of the terminology.
Kriging usually reformulates the covariance kernel as a variogram:

  http://en.wikipedia.org/wiki/Variogram

The variogram as a function of radius usually looks like a smoothish
curve starting at the "nugget" value near 0 (representing the
uncertainty of each individual measurement, or the uncertainty of a
point infinitesimally close to point X given the value at point X) and
rising asymptotically toward the "sill" value (the prior
variance over the whole domain, or the uncertainty of a point
infinitely far away from point X given the value at point X). The
variogram can be estimated from the data, and one can use empirical
variograms more or less directly. It's a bit trickier to back those
out to covariance kernels.
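
To make the nugget/sill picture concrete, here is a rough numpy sketch,
not the sklearn implementation. It assumes a stationary process with a
squared-exponential covariance and independent measurement noise; the
function names and the parameters partial_sill, length_scale, and nugget
are made up for illustration. Under those assumptions the variogram of
the noisy observations is nugget + C(0) - C(h) for h > 0, and the same
nugget shows up as a diagonal term in the covariance matrix. Note that
in this sketch the curve levels off at nugget + partial_sill.

```python
import numpy as np


def sq_exp_cov(h, partial_sill=1.0, length_scale=1.0):
    """Squared-exponential covariance C(h) of the noise-free process."""
    h = np.asarray(h, dtype=float)
    return partial_sill * np.exp(-0.5 * (h / length_scale) ** 2)


def variogram(h, partial_sill=1.0, length_scale=1.0, nugget=0.0):
    """Semivariogram of the noisy observations.

    gamma(0) = 0 by definition; for h > 0 it is nugget + C(0) - C(h),
    so it starts at the nugget and approaches nugget + partial_sill.
    """
    h = np.asarray(h, dtype=float)
    gamma = nugget + partial_sill - sq_exp_cov(h, partial_sill, length_scale)
    return np.where(h > 0, gamma, 0.0)


def noisy_cov_matrix(x, partial_sill=1.0, length_scale=1.0, nugget=0.0):
    """Covariance matrix of noisy observations at 1-D locations x.

    The nugget appears only on the diagonal.
    """
    x = np.asarray(x, dtype=float)
    h = np.abs(x[:, None] - x[None, :])
    return sq_exp_cov(h, partial_sill, length_scale) + nugget * np.eye(len(x))


def empirical_variogram(x, y, bin_edges):
    """Binned estimate of the semivariogram from 1-D data.

    Averages 0.5 * (y_i - y_j)**2 over all pairs whose separation falls
    in each bin; bins with no pairs are returned as NaN.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    i, j = np.triu_indices(len(x), k=1)      # each unordered pair once
    h = np.abs(x[i] - x[j])
    half_sq_diff = 0.5 * (y[i] - y[j]) ** 2
    which = np.digitize(h, bin_edges) - 1    # bin index for each pair
    n_bins = len(bin_edges) - 1
    gamma = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = which == b
        if mask.any():
            gamma[b] = half_sq_diff[mask].mean()
    return gamma
```

Going from a covariance kernel to a variogram is the easy direction, as
in variogram() above. Going the other way from an empirical variogram
generally means fitting a parametric model to the binned estimates,
which is the trickier part.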

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
