On 20-Jun-10 19:07:21, Muenchen, Robert A (Bob) wrote:
>>I wonder if there are any capture-recapture type methodologies for
>>estimating open-source software usage? Another idea would be to
>>combine with some other known numbers, e.g. book sales, conference
>>attendance etc. You'd need personal information to link the data sets
>>together.
>>
>>Hadley
>
> This totally cracked me up! I'm envisioning going into one of our
> computer labs, tossing a net over an unsuspecting student, and then
> tagging their ear with a code that represents which stat package
> they're using. Then release and later recapture. What percent did
> we get? That's what the profs I deal with do with animals to estimate
> populations.
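
For reference, the tag, release and recapture arithmetic described above is usually written as the two-sample Lincoln-Petersen estimate, N = M * C / R. The following is only a minimal sketch: the function name and the example counts are made up for illustration, and the estimate assumes a closed population with every individual equally likely to be caught, which is unlikely to hold for software users.

```python
def lincoln_petersen(marked_first: int, caught_second: int, recaptured: int) -> float:
    """Two-sample mark-recapture estimate of population size.

    marked_first  (M): individuals tagged and released in the first sample
    caught_second (C): individuals caught in the second sample
    recaptured    (R): individuals in the second sample that carry a tag
    """
    if recaptured <= 0:
        raise ValueError("need at least one recapture to form an estimate")
    if recaptured > min(marked_first, caught_second):
        raise ValueError("recaptures cannot exceed either sample size")
    # The tagged fraction in the second sample, R / C, is taken as an
    # estimate of the tagged fraction in the whole population, M / N.
    return marked_first * caught_second / recaptured


# Illustrative numbers only: 50 tagged, 40 caught later, 10 of those tagged.
estimate = lincoln_petersen(50, 40, 10)
print(estimate)  # 200.0
```
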
I've given thought in the past to the question of estimating the R user base, and came to the conclusion that it is impossible to get an estimate of the number of users that one could trust (or even put anything like a margin of error on).

I think one could get a number which represents a moderately informative lower bound: just count the number of different email addresses that have ever posted to the R-help list. This will of course include people who post (or have posted) from more than one email address, and people who tried R for a while and then dropped it, but my feeling is that these are likely to be outweighed by the number of people who have used R but have never posted (for example, students who get their R help from their instructors, people using R in a corporate context who are discouraged from posting to public lists, etc.).

The number of subscribers to R-help (currently about 10200) is a definite lower bound for the number of R users, but many users post to R-help without being subscribed. I would expect the total number of different email addresses that have posted to R-help to be considerably larger than 10200.

I don't think a "Mark-Recapture" approach is feasible. Further, I don't know how one might take account of the fact that some installations of R (e.g. on a corporate or institutional or departmental server) may each be used by several users.

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 20-Jun-10                                       Time: 20:41:43
------------------------------ XFMail ------------------------------
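
The counting idea in the message above (the number of different email addresses that have ever posted) could be sketched roughly as follows. This is a minimal illustration, not a tested tool for the R-help archive: it assumes the From: header values have already been extracted from the archive by some other means, the function name and the sample addresses are hypothetical, and it cannot merge one person's several addresses, which is exactly the overcount noted in the message.

```python
from email.utils import parseaddr


def count_distinct_senders(from_headers):
    """Count distinct sender addresses in an iterable of From: header values.

    Addresses are compared case-insensitively; display names are ignored.
    """
    seen = set()
    for header in from_headers:
        _, address = parseaddr(header)  # returns (display name, address)
        if address:                     # skip headers with no parsable address
            seen.add(address.lower())
    return len(seen)


# Hypothetical sample: three spellings of one address plus one other sender.
sample = [
    "Alice Example <alice@example.org>",
    "alice@example.org",
    "ALICE@EXAMPLE.ORG (Alice)",
    "Bob Example <bob@example.org>",
]
print(count_distinct_senders(sample))  # 2
```
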