> I think the situation is worse than messy. If a client comes in with data
> that doesn't address the question they're interested in, I think they are
> better served to be told that, than to be given an answer that is not
> actually valid. They should also be told how to design a study that
> actually does address their question.
>
> You (and others) have mentioned Google Analytics as a possible way to
> address the quality of data; that's helpful. But analyzing bad data will
> just give bad conclusions.

As long as we say 'package Foo is the most downloaded package on CRAN',
and not 'package Foo is the most used package for R', we can leave it to
the user to decide if the latter conclusion follows from the former. In
the absence of actual usage data I would think it a good approximation.
Not that I would risk my life on it.

Pop music charts are now based on download counts, but I wouldn't
believe they represent the songs that are listened to the most times.
Nor would I go so far as to believe they represent the quality of the
songs...

Should R have a 'Would you like to tell CRAN every time you do
library(foo) so we can do usage counts (no personal data is transmitted
blah blah)?'? I don't think so....

Barry

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.