Re: [R] R and clinical studies
Thanks for the tip. I look forward to trying this package out soon!

Regards,
-Cody

Hans-Peter [EMAIL PROTECTED] wrote on 03/23/2007 09:31 AM (Subject: Re: [R] R and clinical studies):

Hi,

2007/3/20, [EMAIL PROTECTED] wrote:

and (2) SAS seems to play nicer with MS products (e.g. PROC IMPORT seemed to read in messy Excel spreadsheets better than importData in Splus).

(to pick one small detail from your post)

You can use my xlsReadWrite package, which will (on Windows) read and write Excel data (see CRAN; a new version is pending). While there is also a pro version, in a lot of circumstances the free version is perfectly fine.

--
Regards,
Hans-Peter

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R and clinical studies
Thank you to all those who responded to Delphine's original post on R and clinical studies. The replies have provided much food for thought. I have a couple of follow-up questions/comments.

Andrew is quite correct in pointing out that there are classes and workshops available for R. It's my understanding that there are even commercial versions of R that now provide formal commercial-style courses. And at any rate, the money saved by potentially avoiding pricey software could certainly justify any training expense in time or money. This assumes, of course, that the pricey software could be dispensed with (I suspect that would take considerable time at my current company, as so many legacy projects have been done in proprietary software). I still think that R provides less 'hand-holding' and requires more initiative (which may be more or less present on a per programmer/statistician basis).

I guess one could always integrate R/Splus with SAS, as Terry's group has done at Mayo. I will probably do this at least as a start. I have a few concerns with regard to this approach (these may be needless concerns, but I will venture expressing them anyway). First, I'm worried about compatibility (will anyone be worried about a SAS dataset read into R or vice versa?). Second, I would prefer focusing all my learning on one package if possible. I actually have more experience with SAS (as do others in my group), and if the switch to R is to be made I would like to make that switch as complete as possible. This would also avoid requiring new hires to know both languages. Third, if SAS is to be kept around, it defeats one of the main advantages of having open source code in the first place (R is wonderfully free!).

Like Mayo, Baylor Health (my previous employer) used both Splus and SAS. I was warned that data manipulation would be much more difficult in R/Splus than it was in SAS. To be honest, and I say this humbly realizing that most posters to this list have much more experience than I, I haven't found data manipulation to be that much more difficult in R/Splus (at least as I have gained experience in R/Splus). I can think of two exceptions: (1) large datasets and (2) SAS seems to play nicer with MS products (e.g. PROC IMPORT seemed to read in messy Excel spreadsheets better than importData in Splus). Is it possible (and I again say this with MUCH humility) that the perceived advantages of SAS with regard to data manipulation may be due in part to some users only using R/Splus for stat modeling and graphics (thus never becoming familiar with the data manipulation capabilities of R/Splus) or to the reluctance of SAS-trained individuals and companies to make the complete switch?

Tony, the story about the famous software and the certain operating system at the large company was priceless.

In closing, I should mention that in all posts I am speaking for myself and not for Edwards LifeSciences.

Regards,
-Cody
Re: [R] R and clinical studies
[EMAIL PROTECTED] wrote:

[...] I was warned that data manipulation would be much more difficult in R/Splus than it was in SAS. To be honest, and I say this humbly realizing that most posters to this list have much more experience than I, I haven't found data manipulation to be that much more difficult in R/Splus (at least as I have gained experience in R/Splus). I can think of two exceptions: (1) large datasets and (2) SAS seems to play nicer with MS products (e.g. PROC IMPORT seemed to read in messy Excel spreadsheets better than importData in Splus). Is it possible (and I again say this with MUCH humility) that the perceived advantages of SAS with regard to data manipulation may be due in part to some users only using R/Splus for stat modeling and graphics (thus never becoming familiar with the data manipulation capabilities of R/Splus) or to the reluctance of SAS-trained individuals and companies to make the complete switch?

You are exactly correct on this point. Some graduate programs only teach students how to use R/S-Plus for modeling and graphics. R/S-Plus are wonderful for data manipulation: more powerful than SAS but not easy to learn (plus in R there are sometimes too many ways to do something, and new users get lost, e.g. the reshape and reShape functions and the reshape package). http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf has many examples of complex data manipulation, as do some web sites. We do analysis for pharmaceutical companies with 100% of the data manipulation done in R after importing, say, 50 SAS datasets into R. Doing tasks such as finding a lab value measured the closest in time to some event is much more elegant in R/S-Plus than in SAS.

Frank

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
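The task Frank mentions, finding the lab value measured closest in time to some event, might be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the data frames `events` and `labs`, their column names, and all values are hypothetical and are not taken from the thread or from the Vanderbilt document.

```r
# Hypothetical data: one event per subject, several lab draws per subject.
events <- data.frame(
  id = c(1, 2),
  event_date = as.Date(c("2007-03-10", "2007-03-15"))
)
labs <- data.frame(
  id = c(1, 1, 1, 2, 2),
  lab_date = as.Date(c("2007-03-01", "2007-03-09", "2007-03-20",
                       "2007-03-14", "2007-03-17")),
  value = c(5.1, 5.6, 4.9, 7.2, 7.0)
)

# Pair every lab draw with its subject's event and compute the gap in days.
both <- merge(labs, events, by = "id")
both$gap <- abs(as.numeric(both$lab_date - both$event_date))

# Sort so the smallest gap comes first within each subject, then keep the
# first row per subject (ties go to the earlier lab date).
both <- both[order(both$id, both$gap, both$lab_date), ]
closest <- both[!duplicated(both$id), c("id", "lab_date", "value", "gap")]
rownames(closest) <- NULL
closest
```

The sort-then-`duplicated()` idiom plays the role that a BY-group with `first.` processing would play in a SAS data step; other R idioms (for example `by()` or `tapply()`) would work as well.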
Re: [R] R and clinical studies
A strength of R is that there is a wide variety of contributions to the package, giving it great breadth. A weakness of R is that there is a wide variety of contributors to the package, some of whom spend a lot of time on the task of function correctness, and some of whom spend little; some worry about backward compatibility, some sneer at the idea; some spend a lot of time on maintenance, and some don't have the time to do so or move on to other things.

The survival code, for instance, has a set of exact test cases. These are small data sets where the correct answer has been carefully worked out by hand. S (Splus or R) passes all the tests; SAS passes most of them. (Most of the tests are documented in an appendix of Therneau and Grambsch, Springer, 2000.) These test cases have been a great help in creating and debugging the code, but overall represent a large amount of work. Most code that does not have a corporate sponsor will not have the resources to do this. I have them mostly because the survival library's genesis has been spread out over 20 years, and individual bits were important parts of clinical trials and so HAD to be right.

(Aside: SAS has a deserved reputation for accuracy. It has an undeserved one for infallibility --- one of my favorite bug reports for the S code started out "I've found a mistake in the coxph function, it gives a different answer than SAS." It turned out in that case that the S and SAS data sets in their example were not quite the same. As an earlier poster said, data management and manipulation is the root of most errors.)

Our group uses SAS for data manipulation primarily, and a mix of SAS and S-Plus for the analysis. It would be difficult to become a pure S shop, but we've had no trouble with the mix.

Terry Therneau
Biostatistics, Mayo Clinic
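The idea of an exact test case, a small data set whose correct answer has been worked out by hand, can be illustrated with a toy example. The sketch below is not one of the survival library's actual test cases; the function `km_by_hand`, the four observations, and the expected values are all made up for illustration, using a Kaplan-Meier estimate because it is easy to verify on paper.

```r
# Illustrative Kaplan-Meier estimator in base R (a hypothetical helper,
# not the survival package's code).
km_by_hand <- function(time, status) {
  event_times <- sort(unique(time[status == 1]))
  surv <- 1
  out <- numeric(length(event_times))
  for (i in seq_along(event_times)) {
    t <- event_times[i]
    at_risk <- sum(time >= t)               # subjects still under observation
    deaths <- sum(time == t & status == 1)  # events at exactly this time
    surv <- surv * (1 - deaths / at_risk)
    out[i] <- surv
  }
  data.frame(time = event_times, surv = out)
}

# Tiny made-up data set: status 1 = event, 0 = censored.
time <- c(1, 2, 3, 4)
status <- c(1, 0, 1, 1)

# Worked by hand:
#   t = 1: 4 at risk, 1 event -> 3/4
#   t = 3: 2 at risk, 1 event -> 3/4 * 1/2 = 3/8
#   t = 4: 1 at risk, 1 event -> 0
expected <- c(3/4, 3/8, 0)

result <- km_by_hand(time, status)
stopifnot(isTRUE(all.equal(result$surv, expected)))
```

A collection of such cases, kept alongside the code and rerun after every change, is the kind of check described above; the effort lies in working out the expected answers by hand for cases that exercise ties, censoring, and other edge conditions.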
Re: [R] R and clinical studies
On Friday 16 March 2007 09:36, Delphine Fontaine wrote:

Thanks for your answer, which was very helpful. I have another question: I have read in this document (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the programs written in R are ephemeral and that new releases are not always compatible with previous releases. What I would like to know is if R functions are already validated and if not, what should we do to validate an R function?

Validation is in the eye of the beholder. In particular, for clinical studies, from the corporate or institutional point of view, "what should we do to validate an R function" should be answered by the local Standard Operating Procedures (SOPs) for "what should we do to validate a computer programming language function". If you are working with clinical trials as part of a health authority submission process, you should have those in place.

Of course, what you are probably interested in is an approach where you qualify R, and validate programs and packages written for R, which might be a better approach, in which case the same applies. Your SOPs should apply to both.

(Now, assuming that you've done a reasonable job on the processes, as per Mat's answer, the point is that R vs. anything else is a simple red herring, as there is nothing in the spirit of the regulations which differentiates any of the characteristics of R from those of any other reasonable piece of software, for appropriate definitions of reasonableness.)

<digression title="semi-relevant, on SOPs and commercial software">
I should point out that a certain large company I'm familiar with, which uses a certain famous piece of statistical software for activities perhaps described above, can't use the most recent version because of interesting issues with its self-qualification tool, which prevents it from self-qualifying the new version on any installation on a certain operating system originating near where I used to live, when the previous version of the famous software had been installed. This feature, if not reverted, would necessitate a total disk wipe of ALL computers requiring qualification running this operating system, where the new version of this famous piece of software would be installed, if this certain large company wants to follow its SOPs. This is apparently a feature, not a bug, and demonstrates clearly the benefits and joys of commercial support when millions of Swiss francs of licensing fees are involved.
</digression>

I'm not a lawyer, nor am I speaking for any corporation indirectly referenced above, nor will I provide sufficient justification to help anyone else take any of the statements as a fact.

best,
-tony

[EMAIL PROTECTED]
Muttenz, Switzerland.
"Commit early, commit often, and commit in a repository from which we can easily roll-back your mistakes" (AJR, 4Jan05).
Re: [R] R and clinical studies
Thanks for your answer, which was very helpful. I have another question: I have read in this document (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the programs written in R are ephemeral and that new releases are not always compatible with previous releases. What I would like to know is whether R functions are already validated and, if not, what we should do to validate an R function.

--
Delphine Fontaine

Quoting Soukup, Mat [EMAIL PROTECTED]:

Delphine,

Please see the following message posted a week ago: http://comments.gmane.org/gmane.comp.lang.r.general/80175.

HTH,
-Mat

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Delphine Fontaine
Sent: Friday, March 09, 2007 8:29 AM
To: r-help@stat.math.ethz.ch
Subject: [R] R and clinical studies

Does anyone know if for clinical studies the FDA would accept statistical analyses performed with R?

Delphine Fontaine
Re: [R] R and clinical studies
Delphine Fontaine wrote:

Thanks for your answer, which was very helpful. I have another question: I have read in this document (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the programs written in R are ephemeral and that new releases are not always compatible with previous releases. What I would like to know is if R functions are already validated and if not, what should we do to validate an R function?

In the sense in which most persons use the term 'validate', it means to show with one or more datasets that the function is capable of producing the right answer. It doesn't mean that it produces the right answer for every dataset, although we hope it does. [As an aside, most errors are in the data manipulation phase, not in the analysis phase.] So I think that instead of validating functions we should spend more effort on validating analyses [and validating analysis file derivation]. Pivotal analyses can be re-done a variety of ways, in R or in separate programmable packages such as Stata.

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
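As a small illustration of re-doing an analysis a second way, the sketch below fits a linear regression with `lm()` and then recomputes the coefficients independently from the normal equations, checking that the two agree. The choice of the built-in `cars` data set and of a simple linear model is an assumption made purely for illustration; a real pivotal analysis would be re-derived from the study's own analysis files, possibly in a different package.

```r
# First route: the packaged routine (lm uses a QR decomposition internally).
fit <- lm(dist ~ speed, data = cars)

# Second route: solve the normal equations (X'X) b = X'y directly.
X <- cbind(1, cars$speed)   # design matrix with an intercept column
y <- cars$dist
beta <- solve(crossprod(X), crossprod(X, y))

# The two sets of estimates should agree to numerical tolerance.
stopifnot(isTRUE(all.equal(unname(coef(fit)), as.vector(beta))))
```

Agreement between two routes does not prove both are right, but a discrepancy is a cheap and reliable signal that something in the data derivation or the code needs a closer look.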
Re: [R] R and clinical studies
I agree that most problems arise in the data management / file derivation phase. From my reading of 21 CFR 11, it appears that this document focuses primarily on data management (as well as on software directly involved in a medical device) rather than on validation of statistical functions. I believe this point has been made previously on the R-help list.

With regard to validating functions, I have often wondered how one can validate a function when one cannot see what it is doing. You could certainly compare calculations from one package to the same calculations from another package, but then you must purchase (ouch!) and know how to properly use two software packages instead of one. And I suppose they could both be wrong! Is not peer review the best form of validation? . . . I suspect I may be preaching to the choir here.

I would love nothing more than to migrate our stat group over to R from SAS. Based on my experience with R/Splus, the language seems more extensible and flexible, and has much better graphics (as has been pointed out many times on this list). It also has available the many contributions of generous R users. However, it has been hard to win pure SAS users over to R (even if it saves the company money!). One can't send the biostat group off to R training as one would to SAS classes. Learning R requires initiative from the user (which is not necessarily a bad thing). I considered encouraging the purchase of Splus as an intermediate step (hoping that its proprietary nature would soothe fears regarding open source software), but that option was not as cheap as I thought.

Regards,
-Cody

Delphine Fontaine wrote:

Thanks for your answer, which was very helpful. I have another question: I have read in this document (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the programs written in R are ephemeral and that new releases are not always compatible with previous releases. What I would like to know is if R functions are already validated and if not, what should we do to validate an R function?

In the sense in which most persons use the term 'validate', it means to show with one or more datasets that the function is capable of producing the right answer. It doesn't mean that it produces the right answer for every dataset, although we hope it does. [As an aside, most errors are in the data manipulation phase, not in the analysis phase.] So I think that instead of validating functions we should spend more effort on validating analyses [and validating analysis file derivation]. Pivotal analyses can be re-done a variety of ways, in R or in separate programmable packages such as Stata.

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
Re: [R] R and clinical studies
Delphine,

Please see the following message posted a week ago: http://comments.gmane.org/gmane.comp.lang.r.general/80175.

HTH,
-Mat

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Delphine Fontaine
Sent: Friday, March 09, 2007 8:29 AM
To: r-help@stat.math.ethz.ch
Subject: [R] R and clinical studies

Does anyone know if for clinical studies the FDA would accept statistical analyses performed with R?

Delphine Fontaine