> I want libraries that contain algorithms to check for relationships
> within a dataset. For example, I want to parse through a SES dataset to
> see any possible connections between student achievement and
> socioeconomic standing, and correlate that to neighborhood wealth.
OK, with that background I now return to your original question:

> Can someone explain sklearns to me? I'm a novice at Python,
> and I would like to use machine learning in my coding.
> But aren't there libraries like matplotlib I can already
> use? Why use sklearns?

Starting at the end first...

matplotlib is a plotting library: you give it some raw data and it plots a nice graphical image in any style you choose. Think of it like a programmatic version of the plotting feature in a spreadsheet. sklearn doesn't do that; it will generate the data for you to plot with matplotlib if you wish. (At least that's how I interpret the information on the sklearn web page.) So it's not either/or - you need both.

sklearn, as the sk in the name suggests, is a SciKit (its full name is scikit-learn), one of a set of add-on packages built on top of SciPy. matplotlib is a separate package that belongs to the same family of scientific Python tools.

What sklearn brings to the picture - again based on a very quick skim through the introductory material - is a framework for doing machine learning. If you just want to play with its standard datasets then it's very easy to use. If you want to use it on your own data it gets harder - you need to format your data into the shape sklearn expects. You then need to specify/select or write the algorithms needed for sklearn to do its learning. Don't underestimate how much preparatory work you will need to do to feed the engine. It's not magic.

For what you want, Pandas or Rpy might be able to do it just as easily - but since you don't seem to already know either of those, sklearn would seem to be a reasonable alternative/complementary choice. But if you don't know basic Python well, that might be a bigger challenge.

Given my level of ignorance about both sklearn and your problem domain, I can't say more than that. I would suggest asking again on the SciPy forum, since you are likely to find a lot more people there who have experience of both - and of alternatives like Pandas and Rpy.
> And I know I should learn R, but I'm also learning
> Python as my primary language now, and R isn't
> really a programming language as Python, Java,

It's not quite as general - I wouldn't try writing games or GUIs or web apps in R. But you can write fully self-contained applications in it if you wish. And for traditional statistical number crunching it's better than either Python or Java. Fortunately you can use R from either language via libraries.

--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.amazon.com/author/alan_gauld
Follow my photo-blog on Flickr at:
http://www.flickr.com/photos/alangauldphotos