On 1 October 2016 at 20:48, Алексей Драль <[email protected]> wrote:
> Hi Thomas,
>
> What quality do you get on the training set?
>
> There is no silver bullet, but there is a quite common technique you can
> use to find out whether you are using an appropriate algorithm. Take a
> look at the difference between the "train" and "validation" quality on
> the learning curves (example:
> <http://scikit-learn.org/stable/auto_examples/model_selection/plot_learning_curve.html#example-model-selection-plot-learning-curve-py>).
> If you see a big gap, you can reduce the complexity of your model to
> overcome overfitting (reduce the interaction parameter / number of
> variables / iterations / ...). If you see a small gap, you can try to
> increase the model complexity to fit your data better.

Hi Алексей, are the "Training examples" in the learning curves the number of
observations used for training? Don't you think my dataset is kind of small
(42 observations) to use that technique?

> Moreover, I see you have a tiny dataset and use a 50/50 split. I presume
> that you will train the "production" model on the whole available dataset.
> In that case, I suggest you use more data for training and an almost-LOO
> <http://scikit-learn.org/stable/modules/cross_validation.html#leave-one-out-loo>
> approach to better estimate your predictive quality. But be really
> cautious with cross-validation, as you can easily overfit your data.
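
For concreteness, here is a minimal sketch of both suggestions (learning
curves and an almost-LOO evaluation) with a recent scikit-learn. The Ridge
model and the synthetic 42-observation dataset are only placeholders for my
actual estimator and data:

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import learning_curve, LeaveOneOut, cross_val_score

    # Placeholder small dataset standing in for the real 42-observation one.
    X, y = make_regression(n_samples=42, n_features=5, noise=10.0, random_state=0)

    model = Ridge(alpha=1.0)  # placeholder estimator

    # Learning curve: train vs. validation score as the number of training
    # examples grows. A persistent large gap suggests overfitting; two low,
    # close curves suggest underfitting.
    train_sizes, train_scores, valid_scores = learning_curve(
        model, X, y, cv=5, train_sizes=np.linspace(0.2, 1.0, 5)
    )
    print("train sizes:", train_sizes)
    print("mean train score:", train_scores.mean(axis=1))
    print("mean validation score:", valid_scores.mean(axis=1))

    # Leave-one-out cross-validation: with a tiny dataset, train on almost
    # all observations in each fold and hold out a single point.
    loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut(),
                                 scoring="neg_mean_absolute_error")
    print("LOO mean absolute error:", -loo_scores.mean())

The mean-absolute-error scoring is just one choice that stays well defined
on single-observation test folds; the right metric depends on the problem.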
_______________________________________________
scikit-learn mailing list
[email protected]
https://mail.python.org/mailman/listinfo/scikit-learn
