2011/11/29 Kenneth C. Arnold <[email protected]>:
> On Tue, Nov 29, 2011 at 4:53 PM, Olivier Grisel
> <[email protected]> wrote:
>> Now back to your problem: I think we should support fitting models with
>> just one sample for the sake of consistency / continuity, even if
>> there is no practical application of fitting models with a single
>> sample. Fitting a model with 2 samples would be almost as stupid as
>> fitting it with only one, and I know of no principled or natural,
>> pre-determined threshold that would give us the minimum number of
>> samples to provide to an estimator.
>>
>> IMHO this is a bug. GaussianProcess and other scikit-learn estimators
>> should accept singleton training sets in fit and provide
>> predictions that are mathematically consistent even if useless in
>> practice.
>
> I misspoke earlier: the MLE for a GP conditioned on a single point is
> just the value at that point, just as the maximum likelihood predictor
> for a Gaussian fit to one data point is that data point. (The variance
> is indeed ill-posed, but the prediction is just the mean.)

That makes sense. Fortunately we don't have an API to compute the
expected variance of a prediction :)
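
For the record, here is a minimal numpy sketch of the single-point case.
It is not the scikit-learn GaussianProcess code: the squared-exponential
kernel, the zero prior mean, the noise-free setting and the helper names
(sq_exp_kernel, gp_posterior_mean) are all assumptions made for
illustration only.

```python
import numpy as np


def sq_exp_kernel(a, b, length_scale=1.0):
    """Squared-exponential covariance between two 1-D arrays of points.

    Hypothetical helper, chosen only for this illustration.
    """
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)


def gp_posterior_mean(x_train, y_train, x_query, length_scale=1.0):
    """Posterior mean of a zero-mean, noise-free GP (assumed setting)."""
    K = sq_exp_kernel(x_train, x_train, length_scale)
    k_star = sq_exp_kernel(x_query, x_train, length_scale)
    # mean(x*) = k(x*, X) K^{-1} y
    return k_star @ np.linalg.solve(K, y_train)


# A singleton training set: one input, one observed value.
x_train = np.array([0.5])
y_train = np.array([2.0])

# At the training point the posterior mean reproduces the observation.
mean_at_train = gp_posterior_mean(x_train, y_train, x_train)

# Far from the training point the mean falls back towards the prior mean (0).
mean_far_away = gp_posterior_mean(x_train, y_train, np.array([10.5]))

# Plain Gaussian fit to one data point: the ML mean is the point itself,
# and the ML variance estimate collapses to zero (the degenerate part).
mle_mean = y_train.mean()
mle_var = y_train.var()  # ddof=0, i.e. the maximum likelihood estimate
```

With one sample K is just the 1x1 matrix [[1.0]], so the solve is
trivial and nothing in the mean computation requires more than one
point.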

> https://github.com/scikit-learn/scikit-learn/pull/97 looks like
> activity fizzled right as it was about ready to merge. What's the
> status? [Yes, I'm cautiously expressing and gauging interest without
> implicitly promising work.]

Indeed, this pull request got forgotten and needs a champion to revive
it: update it to the current state of master, report on the status of
the pending points raised in the previous comments, and make sure that
the documentation is up to date and that the tests pass with good
coverage.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
