Dear R Developers,

This is my first time subscribing to this list, so let me start out by saying 
thank you all very much for the incredible contribution you have made to 
science through your work on R. 

As you all know, many users of commercial stat packages, their managers, 
directors, CIOs etc. are skeptical of R's quality/accuracy. And as the recent 
NY Times article demonstrated, the commercial vendors rarely miss an 
opportunity to stoke those fears. I have read many r-help posts on this subject 
so I was aware that R was developed and tested with great care, but until I read 
the clinical trials doc, I was not aware that there were as many steps and that 
they were as rigorous (use of version management software, etc.). Even as I read 
the document, the opening paragraphs made me think it was far too focused to be 
of general use. Luckily, I kept reading through the CFRs. Modifying that doc 
would take little effort as I outlined in my original post (below). Putting it 
in easy reach of every R user is important. Adding it to the docs in R's 
Help menu, with a FAQ entry pointing to it, would give all R users ready 
access. 

My second suggestion is adding an option to the R installation that would let 
every R user run the test suite, very clearly showing them that it is being 
done. I realize this is a superfluous step, since you have already run the test 
suite against R before releasing it. However, it would give users something 
they could easily show to skeptics: visible evidence that very thorough testing 
is being done. I don't know whether the written messages I suggested below 
would be best, or whether simply showing the output scrolling by would have 
more impact. 
Perhaps both, as in a message "Testing accuracy of linear regression..." in a 
message window while the output scrolled by in the console.

Rather than having this as a part of installation, an alternative would be to 
end the installation with a message pointing people to a function like 
validate.R() and an equivalent menu selection as a following step. That would 
ensure that everyone knows the option exists, plus it enables any R user to run 
the tests for skeptics at any time. The easier it is to run the test suite the 
better. 

The complete set of validation programs you use may be huge and impractical for 
most people to run. In that case, perhaps just a subset could be compiled, with 
an emphasis on testing the common statistical functions that people are likely 
to focus their concern upon.
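
To make the validate.R() idea concrete, here is a minimal sketch of what I 
have in mind. Only the function name comes from my suggestion above; the 
`tests` and `tol` arguments, the two toy checks, and the small data set are 
assumptions for illustration. This is not the actual R test suite, which is 
far more extensive. Each check compares a built-in routine against an 
independent hand computation and prints a progress message as it runs.

```r
# Hypothetical sketch: validate.R() is NOT an existing R function.
# The data and the two checks are illustrative assumptions only.
validate.R <- function(tests = c("t-tests", "linear regression"),
                       tol = 1e-8) {
  # Small made-up data set, used only for this illustration
  x <- c(1, 2, 3, 4, 5)
  y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

  checks <- list(
    "t-tests" = function() {
      # One-sample t statistic (mu = 0) by hand: mean / (sd / sqrt(n))
      by.hand <- mean(y) / (sd(y) / sqrt(length(y)))
      builtin <- unname(t.test(y)$statistic)
      isTRUE(all.equal(by.hand, builtin, tolerance = tol))
    },
    "linear regression" = function() {
      # Least-squares coefficients from the normal equations
      X <- cbind(1, x)
      by.hand <- unname(drop(solve(crossprod(X), crossprod(X, y))))
      builtin <- unname(coef(lm(y ~ x)))
      isTRUE(all.equal(by.hand, builtin, tolerance = tol))
    }
  )

  # Allow the user to pick a subset of the available checks
  tests <- match.arg(tests, names(checks), several.ok = TRUE)

  results <- vapply(tests, function(nm) {
    message("Validating accuracy of ", nm, "...")
    ok <- checks[[nm]]()
    message("  ", if (ok) "OK" else "FAILED")
    ok
  }, logical(1))

  invisible(results)
}

res <- validate.R()
```

Calling validate.R("linear regression") would run only that one check, which 
is the kind of subset selection I mean. A real version would of course draw 
on the existing validation tests and known results rather than toy checks 
like these.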

Asking to add a superfluous step to an installation may seem like a waste of 
time, and technically it is. But psychologically this testing will have an 
important impact that will silence many critics. Thanks for taking the time to 
consider it.

Best regards,
Bob


-----Original Message-----
From: Peter Dalgaard [mailto:p.dalga...@biostat.ku.dk] 
Sent: Saturday, January 24, 2009 4:53 AM
To: Muenchen, Robert A (Bob)
Cc: r-h...@r-project.org
Subject: Re: [R] The Quality & Accuracy of R

Bob,

Your point is well taken, but it also raises a number of issues 
(post-install testing to name one) for which the R-devel list would be 
more suitable. Could we move the discussion there?

        -Peter


Muenchen, Robert A (Bob) wrote:
> Hi All,
> 
>  
> 
> We have all had to face skeptical colleagues asking if software made by
> volunteers could match the quality and accuracy of commercially written
> software. Thanks to the prompting of a recent R-help thread, I read, "R:
> Regulatory Compliance and Validation Issues, A Guidance Document for the
> Use of R in Regulated Clinical Trial Environments"
> (http://www.r-project.org/doc/R-FDA.pdf). This is an important document,
> of interest to the general R community. The question of R's accuracy is
> such a frequent one, it would be beneficial to increase the visibility
> of the non-clinical information it contains. A document aimed at a
> general audience, entitled something like, "R: Controlling Quality and
> Assuring Accuracy" could be compiled from these sections:
> 
>  
> 
> 1.      What is R? (section 4)
> 
> 2.      The R Foundation for Statistical Computing (section  3)
> 
> 3.      The Scope of this Guidance Document (section 2)
> 
> 4.      Software Development Life Cycle (section 6)
> 
>  
> 
> Marc Schwartz, Frank Harrell, Anthony Rossini, Ian Francis and others
> did such a great job that very few words would need to change. The only
> addition I suggest is to mention how well R did in Keeling & Pavur's
> "A comparative study of the reliability of nine statistical software
> packages," Computational Statistics & Data Analysis, May 1, 2007, Vol. 51,
> pp. 3811-3831. 
> 
>  
> 
> Given the importance of this issue, I would like to see such a document
> added to the PDF manuals in R's Help.
> 
>  
> 
> The document mentions (Sect. 6.3) that a set of validation tests, data
> and known results are available. It would be useful to have an option to
> run that test suite in every R installation, displaying clear progress
> messages such as "Validating accuracy of t-tests...Validating accuracy of
> linear regression...." Whether or not people choose to run the tests, they would
> at least know that such tests are available. Back in my mainframe
> installation days, this step was part of many software installations and
> it certainly gave the impression that those were the companies that took
> accuracy seriously. Of course the other companies probably just ran
> their validation suite before shipping, but seeing it happen had a
> tremendous impact. I don't know how much this would add to the download,
> but if it were too much, perhaps it could be implemented as a separate
> download. 
> 
>  
> 
> I hope these suggestions can help mitigate the concerns so many non-R
> users have.
> 
>  
> 
> Cheers,
> 
> Bob
> 
>  
> 
> =========================================================
> 
> Bob Muenchen (pronounced Min'-chen), 
> 
> Manager, Research Computing Support 
> 
> U of TN Office of Information Technology
> 
> Stokely Management Center, Suite 200
> 
> 916 Volunteer Blvd., Knoxville, TN 37996-0520
> 
> Voice: (865) 974-5230
> 
> FAX: (865) 974-4810
> 
> Email: muenc...@utk.edu
> 
> Web: http://oit.utk.edu/research <http://oit.utk.edu/scc> 
> 
> Map to Office: http://www.utk.edu/maps    
> 
> Newsletter: http://listserv.utk.edu/archives/rcnews.html
> <http://listserv.utk.edu/archives/statnews.html> 
> 
> =========================================================
> 
> 
> ______________________________________________
> r-h...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
