[Scikit-learn-general] High training error for small datasets

Kai Kuehne Fri, 22 Jun 2012 01:43:09 -0700

Hi,
I posted this question a few days ago on IRC shortly before my
internet connection broke down,
so sorry if you read this already.


I'm currently building a simple classification system and am trying to use
learning curves to check whether
my model suffers from high bias or high variance.
I (think I) followed the instructions on this page:
http://jakevdp.github.com/tutorial/astronomy/practical.html
So, if I understood this correctly, the training error should be small
for small training sets.
But, in my implementation and for my corpus, the training error starts
high: http://i.imgur.com/j4MNx.png
I calculate the error for every m like this: http://dpaste.com/761794/
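
For reference, here is a minimal sketch of how the per-m calculation is
usually set up. It is not the code from the paste above. The data is
synthetic, and NearestCentroid, make_blobs, error_rate and
learning_curve_errors are hypothetical helpers written only for this
illustration. The point it shows is that the training error for a given m
is measured on the same m examples the model was fit on, while the
validation error is always measured on the full held-out set.

```python
import random


class NearestCentroid:
    """Tiny stand-in classifier (hypothetical, for illustration only)."""

    def fit(self, X, y):
        # Accumulate per-class feature sums and counts, then average.
        sums, counts = {}, {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids_ = {
            label: [s / counts[label] for s in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        # Assign each row to the class with the closest centroid.
        def closest(row):
            return min(
                self.centroids_,
                key=lambda label: sum(
                    (a - b) ** 2 for a, b in zip(row, self.centroids_[label])
                ),
            )

        return [closest(row) for row in X]


def error_rate(model, X, y):
    """Fraction of examples in (X, y) that the model misclassifies."""
    predictions = model.predict(X)
    wrong = sum(1 for p, t in zip(predictions, y) if p != t)
    return wrong / len(y)


def learning_curve_errors(make_model, X_train, y_train, X_val, y_val, sizes):
    """Return a list of (m, training error, validation error) tuples."""
    results = []
    for m in sizes:
        X_m, y_m = X_train[:m], y_train[:m]
        model = make_model().fit(X_m, y_m)
        train_err = error_rate(model, X_m, y_m)    # only the m fitted examples
        val_err = error_rate(model, X_val, y_val)  # full held-out set
        results.append((m, train_err, val_err))
    return results


def make_blobs(n, rng):
    """Synthetic two-class data: alternating labels, Gaussian clusters."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        center = 2.0 * label
        X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_blobs(200, rng)
X_val, y_val = make_blobs(100, rng)

curve = learning_curve_errors(
    NearestCentroid, X_train, y_train, X_val, y_val, [2, 10, 50, 200]
)
for m, train_err, val_err in curve:
    print("m=%3d  train error=%.3f  validation error=%.3f" % (m, train_err, val_err))
```

Any object with scikit-learn-style fit/predict methods should work in
place of the stand-in classifier. If the training error were instead
computed on the whole training set for every m, it would start high for
small m, which is one possible cause of a curve like the one in the plot.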

Does anyone have any tips on what I did wrong here?
Thank you!

