List folk, I am a newbie trying to get used to Python. I was wondering if anyone knows of web resources that teach good practices in data cleaning and management for statistics/analytics/machine learning, particularly using Python.
Ideally, these would be exercises of the form: here is some horrible raw data --> here is what it should look like after it has been cleaned. Guidelines about steps that should always be taken, practices that should be avoided; basically, workflow of data analysis in Python with special emphasis on the cleaning part. -- http://mail.python.org/mailman/listinfo/python-list