Karl had been asking about how to estimate population statistics from a sample.

I came across a fascinating approach to this, called bootstrapping or the
bootstrap method. I learned about it in an edX data science course offered by
Berkeley.

The method is described in the course's free online textbook:
  https://inferentialthinking.com/chapters/13/2/Bootstrap.html

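This method makes some assumptions, including that your sample is a reasonably
large random sample. It does not require the population itself to be normally
distributed, although the percentile approach works best when the distribution
of the statistic you are estimating is roughly bell-shaped.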

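The basic approach is to take your sample, and then randomly re-sample from it
with replacement, drawing as many values as the original sample contains and
computing the statistic of interest (a mean or median, say) each time.
Repeating this many times builds up a distribution of that statistic, which
approximates how it would vary across fresh samples from the population and so
gives you an interval estimate for the population value.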
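
A minimal sketch of the resampling idea in plain Python follows. It is not
taken from the course materials, which use their own framework; the function
name bootstrap_ci, its parameters, and the made-up data are assumptions for
illustration only. It uses the percentile method to get an approximate 95%
interval for the population median.

```python
import random
import statistics


def bootstrap_ci(sample, statistic=statistics.median,
                 n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for `statistic` (illustrative helper)."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample draws n values from the sample *with replacement*,
    # and we record the statistic of every resample.
    estimates = sorted(
        statistic(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    # Keep the middle `level` share of the resampled estimates.
    tail = (1 - level) / 2
    lower = estimates[round(tail * n_resamples)]
    upper = estimates[round((1 - tail) * n_resamples) - 1]
    return lower, upper


# Made-up data standing in for a real sample of 200 observations.
data_rng = random.Random(42)
sample = [data_rng.gauss(50, 10) for _ in range(200)]

low, high = bootstrap_ci(sample)
print(f"sample median: {statistics.median(sample):.2f}")
print(f"approx. 95% interval for the population median: {low:.2f} to {high:.2f}")
```

Passing statistic=statistics.mean instead would give an interval for the
population mean. The fixed seeds are only there to make the run repeatable.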

The text includes some worked examples. The course uses Python (Jupyter 
notebooks) and a computational framework based on Pandas.

The course sequence: 
https://www.edx.org/professional-certificate/berkeleyx-foundations-of-data-science

It's the second course, "Inferential Thinking by Resampling," which
introduces and builds on the bootstrap concept. Many of the materials are free,
including the computational framework and examples. I'm not sure whether you
can audit the (self-paced) course for free.

I found the bootstrap method to be very interesting. It is not something that 
came up in my many grad and undergrad statistics courses or other research 
methods (maybe it emerged since I was a university student).

I hope this helps.
  Greg
