Hi list,

Indeed, I had not thought about this usage of the GP predictor (actually, I don't think DACE for Matlab handles this case either). In my opinion, fitting a GP with only one point does not make much sense, even if it holds mathematically (i.e. you can compute the posterior distribution of a two-dimensional Gaussian random vector given its mean and covariance). Nonetheless, as Kenneth pointed out, the maximum likelihood estimate of the variance sigma2 and the correlation parameter theta is ill-posed in this case. Hence, I would rather raise an Exception.
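To make the two halves of that argument concrete, here is a minimal NumPy sketch (not the sklearn implementation). It assumes a zero-mean GP with a squared exponential covariance and hand-picked values of sigma2 and theta, so nothing is estimated. The names sq_exp_cov, posterior_from_one_point and check_enough_samples are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def sq_exp_cov(a, b, sigma2=1.0, theta=1.0):
    """Squared exponential covariance: sigma2 * exp(-theta * ||a_i - b_j||^2)."""
    a = np.atleast_2d(np.asarray(a, dtype=float))
    b = np.atleast_2d(np.asarray(b, dtype=float))
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return sigma2 * np.exp(-theta * sq_dist)


def posterior_from_one_point(x_train, y_train, x_test, sigma2=1.0, theta=1.0):
    """Zero-mean GP posterior at x_test given a single observation.

    sigma2 and theta are fixed by hand (assumption); they are NOT estimated.
    """
    k_tt = sq_exp_cov(x_train, x_train, sigma2, theta)[0, 0]  # equals sigma2
    k_st = sq_exp_cov(x_test, x_train, sigma2, theta)[:, 0]
    k_ss = sq_exp_cov(x_test, x_test, sigma2, theta)
    # Standard Gaussian conditioning with a 1x1 training covariance.
    mean = k_st * (y_train / k_tt)
    cov = k_ss - np.outer(k_st, k_st) / k_tt
    return mean, cov


def check_enough_samples(X):
    """Hypothetical guard: refuse to estimate sigma2/theta from one point."""
    n_samples = np.atleast_2d(X).shape[0]
    if n_samples < 2:
        raise ValueError(
            "Cannot estimate sigma2 and theta by maximum likelihood from "
            "%d training point(s); at least 2 are required." % n_samples
        )
    return n_samples


# Condition on the single point from the original report, then predict at
# that same point and at a point far away from it.
mean, cov = posterior_from_one_point(
    [[1.0, 2.0]], -1.0, [[1.0, 2.0], [10.0, 10.0]]
)
```

With fixed hyperparameters the posterior interpolates the single observation and falls back to the prior far away from it. The guard shows one way a fit routine could fail early with a readable message instead of the ValueError raised inside NumPy.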
@Kenneth: Conditional and unconditional random draws were already submitted to the project in a pull request (https://github.com/scikit-learn/scikit-learn/pull/97) by demianw, although his code has never been merged.

@AlexP: What are you trying to do with this iterative construction? Are you trying to implement some optimization algorithm (like the efficient global optimizer by Jones et al. [1])? If so, note that Jones' "expected improvement" only starts being objective once the dataset becomes "a bit" dense. Starting from one point only is definitely not a good idea (and adding points sequentially is not such a good idea either...).

Cheers,
Vincent

[1] http://www.ressources-actuarielles.net/EXT/ISFA/1226.nsf/8d48b7680058e977c1256d65003ecbb5/f84f7ac703bf5862c12576d8002f5259/$FILE/Jones98.pdf

On 29/11/2011 21:28, Kenneth C. Arnold wrote:
> There is no maximum likelihood solution to a GP with a single training
> point, but you can certainly draw samples from the posterior; in fact,
> you can draw samples from the prior (without conditioning on data).
> That may help you determine if your covariance function is reasonable:
> samples from the prior should look like data that you might expect to
> see.
>
> I'm unfamiliar with the sklearn implementation of GP, but I put some
> MATLAB code demonstrating unconditioned and conditioned draws (using
> the ordinary squared exponential covariance function) at
> https://gist.github.com/1406331.
>
> See chapter 2 of the Rasmussen and Williams book (iirc) for details.
>
> -Ken
>
>
>
> On Tue, Nov 29, 2011 at 3:07 PM, Vlad Niculae <[email protected]> wrote:
>> On Tue, Nov 29, 2011 at 10:02 PM, Alexandre Gramfort
>> <[email protected]> wrote:
>>> Hi Alex,
>>>
>>> I would say:
>>>
>>> if it makes sense to fit a GP with only one point:
>>>     it should be fixed
>> Note that even though it might not make any sense in practice, unless
>> there's a mathematical reason that I'm missing, it shouldn't be
>> prohibited, if only for didactical purposes, in my opinion.
>>
>> Vlad
>>
>>> else:
>>>     raise a nicer error message
>>>
>>> Alex
>>>
>>> On Tue, Nov 29, 2011 at 7:10 PM, Alexandre Passos
>>> <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> Currently the fit function in GaussianProcess throws a weird exception
>>>> when only one training example is passed to fit():
>>>>
>>>> >>> from sklearn.gaussian_process import GaussianProcess
>>>> >>> gp.fit([[1., 2.]], [-1.0])
>>>> Traceback (most recent call last):
>>>>   File "<stdin>", line 1, in <module>
>>>>   File "/Users/apassos/Library/Python/2.7/lib/python/site-packages/sklearn/gaussian_process/gaussian_process.py", line 281, in fit
>>>>     if np.min(np.sum(D, axis=1)) == 0. \
>>>>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/fromnumeric.py", line 1862, in amin
>>>>     return amin(axis, out)
>>>> ValueError: zero-size array to ufunc.reduce without identity
>>>>
>>>> Should this be fixed or should a better error message be passed?
>>>> --
>>>> - Alexandre
>>>>
>>>> ------------------------------------------------------------------------------
>>>> All the data continuously generated in your IT infrastructure
>>>> contains a definitive record of customers, application performance,
>>>> security threats, fraudulent activity, and more. Splunk takes this
>>>> data and makes sense of it. IT sense. And common sense.
>>>> http://p.sf.net/sfu/splunk-novd2d
>>>> _______________________________________________
>>>> Scikit-learn-general mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
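
Kenneth's suggestion of drawing from the prior and from the posterior conditioned on data can be sketched in Python as follows. This is a rough NumPy illustration, not a port of the MATLAB gist and not the code from the pull request mentioned above. It assumes one-dimensional inputs, a zero-mean GP, a squared exponential covariance with sigma2 = theta = 1, and a small diagonal jitter for numerical stability. The names sq_exp_cov, draw_prior and draw_posterior are hypothetical.

```python
import numpy as np


def sq_exp_cov(a, b, sigma2=1.0, theta=1.0):
    """Squared exponential covariance for 1-D inputs (assumed for brevity)."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    return sigma2 * np.exp(-theta * (a - b.T) ** 2)


def draw_prior(x, n_draws, rng, jitter=1e-6):
    """Unconditioned draws: sample from N(0, K(x, x))."""
    K = sq_exp_cov(x, x) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)
    return (L @ rng.standard_normal((len(x), n_draws))).T


def draw_posterior(x_train, y_train, x, n_draws, rng, jitter=1e-6):
    """Conditioned draws: sample from the GP posterior given training data."""
    y_train = np.asarray(y_train, dtype=float)
    K = sq_exp_cov(x_train, x_train) + jitter * np.eye(len(x_train))
    K_s = sq_exp_cov(x, x_train)
    mean = K_s @ np.linalg.solve(K, y_train)
    cov = sq_exp_cov(x, x) - K_s @ np.linalg.solve(K, K_s.T)
    L = np.linalg.cholesky(cov + jitter * np.eye(len(x)))
    return mean + (L @ rng.standard_normal((len(x), n_draws))).T


rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 21)

# Draws from the prior: do they look like data you would expect to see?
prior_draws = draw_prior(x, 3, rng)

# Draws conditioned on a single training point (x = 1.0, y = -1.0).
posterior_draws = draw_posterior([1.0], [-1.0], x, 3, rng)
```

Each row of prior_draws and posterior_draws is one sampled function evaluated on the grid x. The conditioned draws all pass close to the single training point, which works here only because the hyperparameters are fixed rather than estimated.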
