Hi list,

Indeed, I did not think about this usage of the GP predictor (actually I 
don't think DACE for Matlab handles this case either). In my opinion, 
fitting a GP with only one point does not make much sense, even if it 
holds mathematically (i.e. you can condition one component of a 
two-dimensional Gaussian random vector on the other, given the vector's 
mean and covariance). 
Nonetheless, as Kenneth pointed out, the maximum likelihood estimate 
for the variance sigma2 and the correlation parameter theta is ill-posed 
in this case.
Hence, I would rather raise an Exception.
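
As a minimal sketch of the "holds mathematically" remark: with theta and 
sigma2 fixed by hand (not estimated), conditioning on a single point is 
ordinary Gaussian conditioning. The names `sq_exp` and `posterior`, the 
hyperparameter values and the small `nugget` are assumptions made for 
illustration; this is plain NumPy, not the scikit-learn implementation.

```python
import numpy as np


def sq_exp(a, b, theta=1.0, sigma2=1.0):
    """Squared exponential covariance between the rows of a and b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return sigma2 * np.exp(-theta * d2)


def posterior(X_train, y_train, X_test, theta=1.0, sigma2=1.0, nugget=1e-10):
    """Zero-mean GP posterior mean and covariance for FIXED hyperparameters."""
    K = sq_exp(X_train, X_train, theta, sigma2) + nugget * np.eye(len(X_train))
    K_s = sq_exp(X_test, X_train, theta, sigma2)
    K_ss = sq_exp(X_test, X_test, theta, sigma2)
    mean = K_s @ np.linalg.solve(K, y_train)
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return mean, cov


# The single training point from the original report.
X_train = np.array([[1.0, 2.0]])
y_train = np.array([-1.0])

# One test point on top of the observation, one far away from it.
X_test = np.array([[1.0, 2.0], [10.0, 10.0]])
mean, cov = posterior(X_train, y_train, X_test)
```

The prediction interpolates the single observation and reverts to the 
prior far away from it. Nothing in a single observation constrains theta 
or sigma2, which is why estimating them by maximum likelihood is 
ill-posed here.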

@Kenneth: Conditional and unconditional random draws were already 
submitted to the project in a pull request 
(https://github.com/scikit-learn/scikit-learn/pull/97) by demianw, 
although his code has never been merged.
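
For readers without the pull request at hand, here is a rough sketch of 
what unconditional (prior) and conditional (posterior) draws look like. 
It is not demianw's code: the helper names `sq_exp` and `draw`, the 1-D 
grid, the fixed theta and the jitter value are all assumptions chosen 
for illustration.

```python
import numpy as np


def sq_exp(x1, x2, theta=1.0):
    """Squared exponential correlation between two 1-D arrays of inputs."""
    return np.exp(-theta * (x1[:, None] - x2[None, :]) ** 2)


def draw(mean, cov, n_draws, rng, jitter=1e-6):
    """Sample from N(mean, cov) through a Cholesky factor (jitter for stability)."""
    L = np.linalg.cholesky(cov + jitter * np.eye(len(mean)))
    z = rng.standard_normal((len(mean), n_draws))
    return mean[:, None] + L @ z


rng = np.random.default_rng(0)
x = np.linspace(0.0, 5.0, 26)  # grid with step 0.2; x[10] is the point 2.0
K = sq_exp(x, x)

# Unconditional draws: samples from the zero-mean prior.
prior_draws = draw(np.zeros_like(x), K, 3, rng)

# Conditional draws: condition on one observation y = -1 at x = 2.
x_obs = np.array([2.0])
y_obs = np.array([-1.0])
K_oo = sq_exp(x_obs, x_obs) + 1e-10 * np.eye(1)
K_xo = sq_exp(x, x_obs)
post_mean = K_xo @ np.linalg.solve(K_oo, y_obs)
post_cov = K - K_xo @ np.linalg.solve(K_oo, K_xo.T)
post_draws = draw(post_mean, post_cov, 3, rng)
```

Each column of `prior_draws` and `post_draws` is one sampled curve on 
the grid; the conditional curves all pass (up to the jitter) through the 
observed value at x = 2.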

@AlexP: What are you trying to do with this iterative construction? Are 
you trying to implement some optimization algorithm (like the efficient 
global optimizer by Jones et al. [1])? If so, note that Jones' "expected 
improvement" only starts being objective once the dataset becomes "a 
bit" dense. Starting from one point only is definitely not a good idea 
(and adding points sequentially is not such a good idea either...).
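
For reference, the expected improvement criterion for minimization can 
be written EI(x) = (f_min - mu(x)) * Phi(z) + sigma(x) * phi(z), with 
z = (f_min - mu(x)) / sigma(x), where mu and sigma are the GP predictive 
mean and standard deviation and f_min is the best value observed so far. 
A small sketch using only the standard library (the function name and 
the handling of sigma = 0 are assumptions for illustration):

```python
import math


def expected_improvement(mu, sigma, f_min):
    """Expected improvement (minimization) at one candidate point.

    mu, sigma: GP predictive mean and standard deviation at the candidate.
    f_min: best (lowest) objective value observed so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(f_min - mu, 0.0)
    z = (f_min - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (f_min - mu) * cdf + sigma * pdf


# Same predicted mean, different predicted uncertainty.
ei_confident = expected_improvement(mu=0.0, sigma=0.1, f_min=0.0)
ei_uncertain = expected_improvement(mu=0.0, sigma=1.0, f_min=0.0)
```

The criterion depends entirely on the predicted mean and standard 
deviation, so it can only be as good as the fitted model, which is the 
concern with a very sparse initial dataset.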

Cheers,
Vincent

[1] 
http://www.ressources-actuarielles.net/EXT/ISFA/1226.nsf/8d48b7680058e977c1256d65003ecbb5/f84f7ac703bf5862c12576d8002f5259/$FILE/Jones98.pdf

On 29/11/2011 21:28, Kenneth C. Arnold wrote:
> There is no maximum likelihood solution to a GP with a single training
> point, but you can certainly draw samples from the posterior; in fact,
> you can draw samples from the prior (without conditioning on data).
> That may help you determine if your covariance function is reasonable:
> samples from the prior should look like data that you might expect to
> see.
>
> I'm unfamiliar with the sklearn implementation of GP, but I put some
> MATLAB code demonstrating unconditioned and conditioned draws (using
> the ordinary squared exponential covariance function) at
> https://gist.github.com/1406331.
>
> See chapter 2 of the Rasmussen and Williams book (iirc) for details.
>
> -Ken
>
>
>
> On Tue, Nov 29, 2011 at 3:07 PM, Vlad Niculae <[email protected]> wrote:
>> On Tue, Nov 29, 2011 at 10:02 PM, Alexandre Gramfort
>> <[email protected]>  wrote:
>>> Hi Alex,
>>>
>>> I would say:
>>>
>>> if it makes sense to fit a GP with only one point:
>>>     it should be fixed
>> Note that even though it might not make any sense in practice, unless
>> there's a mathematical reason that I'm missing, it shouldn't be
>> prohibited, if only for didactical purposes, in my opinion.
>>
>> Vlad
>>
>>> else:
>>>     raise a nicer error message
>>>
>>> Alex
>>>
>>> On Tue, Nov 29, 2011 at 7:10 PM, Alexandre Passos
>>> <[email protected]>  wrote:
>>>> Hi,
>>>>
>>>> Currently the fit function in GaussianProcess throws a weird exception
>>>> when only one training example is passed to fit():
>>>>
>>>> >>> from sklearn.gaussian_process import GaussianProcess
>>>> >>> gp.fit([[1., 2.]], [-1.0])
>>>> Traceback (most recent call last):
>>>>   File "<stdin>", line 1, in <module>
>>>>   File "/Users/apassos/Library/Python/2.7/lib/python/site-packages/sklearn/gaussian_process/gaussian_process.py", line 281, in fit
>>>>     if np.min(np.sum(D, axis=1)) == 0. \
>>>>   File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/core/fromnumeric.py", line 1862, in amin
>>>>     return amin(axis, out)
>>>> ValueError: zero-size array to ufunc.reduce without identity
>>>> Should this be fixed or should a better error message be passed?
>>>> --
>>>>   - Alexandre
>>>>
>>>> ------------------------------------------------------------------------------
>>>> All the data continuously generated in your IT infrastructure
>>>> contains a definitive record of customers, application performance,
>>>> security threats, fraudulent activity, and more. Splunk takes this
>>>> data and makes sense of it. IT sense. And common sense.
>>>> http://p.sf.net/sfu/splunk-novd2d
>>>> _______________________________________________
>>>> Scikit-learn-general mailing list
>>>> [email protected]
>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>
