2012/6/22 Kai Kuehne <[email protected]>:
> Hi,
> I posted this question a few days ago on IRC shortly before my
> internet connection broke down,
> so sorry if you read this already.
>
> I'm currently building a simple classification system and am trying to
> use learning curves to check whether
> my model suffers from high bias or high variance.
> I (think I) followed the instructions on this page:
> http://jakevdp.github.com/tutorial/astronomy/practical.html
> So, if I understood this correctly, the training error should be small
> for small training sets.
> But, in my implementation and for my corpus, the training error starts
> high: http://i.imgur.com/j4MNx.png
> I calculate the error for every m like this: http://dpaste.com/761794/

Maybe the machine learning algorithm stops before actually reaching
convergence?

What kind of data are you using? What are its dimensions? What type of
model and which parameters are you using?

Here is an alternative implementation of the learning curves:

https://gist.github.com/1540431

They behave as expected in this case.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
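
For reference, here is a minimal, dependency-free sketch of how a learning curve is usually computed. It is not the code from the gist or from the dpaste link; the toy data, the least-squares line model, and the names fit_line, mse, learning_curve and make_data are all made up for illustration. The point it tries to show is that, for each m, the training error is measured on the same m samples the model was fitted on, which is why it should start out small.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def mse(model, xs, ys):
    """Mean squared error of the fitted line on the given samples."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def learning_curve(train_x, train_y, val_x, val_y, sizes):
    """Return a list of (m, training error, validation error) tuples."""
    rows = []
    for m in sizes:
        sub_x, sub_y = train_x[:m], train_y[:m]
        model = fit_line(sub_x, sub_y)
        # Training error: evaluated on the m samples used for fitting,
        # not on the full training set.
        train_err = mse(model, sub_x, sub_y)
        # Validation error: always evaluated on the same held-out set.
        val_err = mse(model, val_x, val_y)
        rows.append((m, train_err, val_err))
    return rows


# Hypothetical toy data: y = 2x + 1 plus Gaussian noise.
rng = random.Random(0)


def make_data(n):
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.3) for x in xs]
    return xs, ys


train_x, train_y = make_data(200)
val_x, val_y = make_data(200)

curve = learning_curve(train_x, train_y, val_x, val_y, [2, 5, 10, 50, 200])
for m, train_err, val_err in curve:
    print("m=%3d  train=%.4f  validation=%.4f" % (m, train_err, val_err))
```

With m = 2 a line fits both points exactly, so the training error is essentially zero, and it grows towards the noise level as m increases. If a curve starts high instead, two things worth checking (both guesses, since the original code is not shown here) are whether the training error is computed on more than the m samples used for fitting, and whether the optimizer has actually converged for small m.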
