Re: [R] R and clinical studies

2007-03-23 Thread Cody_Hamilton

Thanks for the tip.  I will look forward to trying this package out soon!

Regards, -Cody



   
Hans-Peter [EMAIL PROTECTED] wrote on 03/23/2007 09:31 AM:

Hi,

2007/3/20, [EMAIL PROTECTED] [EMAIL PROTECTED]:

 and (2) SAS seems to play nicer with MS products (e.g. PROC IMPORT seemed
 to read in messy Excel spreadsheets better than importData in Splus).

(to pick one small detail from your post)

You can use my xlsReadWrite package, which will (on Windows) read and
write Excel data (see CRAN; a new version is pending). While there
is also a pro version, in a lot of circumstances the free version is
perfectly fine.


--
Regards,
Hans-Peter

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R and clinical studies

2007-03-20 Thread Cody_Hamilton

Thank you to all those that responded to Delphine's original post on R and
clinical studies.  They have provided much food for thought.

I had a couple of follow up questions/comments.  Andrew is very correct in
pointing out that there are classes and workshops available for R.  It's my
understanding that there are even commercial versions of R that now provide
formal commercial-style courses.  And at any rate, the money saved by
potentially avoiding pricey software could certainly justify any training
expense in time or money  - this assumes of course that the pricey software
could be dispensed with (I suspect that would take considerable time at my
current company as so many legacy projects have been done in proprietary
software).  I still think that R provides less 'hand-holding' and requires
more initiative (which may be more or less present on a per
programmer/statistician basis).

I guess one could always integrate R/Splus in with SAS, as Terry's group
has done at Mayo - I will probably do this at least as a start.  I have a
few concerns with regards to this approach (these may be needless concerns,
but I will venture expressing them anyway).  First, I'm worried about the
possibility of compatibility problems (will anyone be worried about a SAS
dataset read into R or vice versa?).  Second, I would prefer focusing all
my learning on one package if possible.  I actually have more experience
with SAS (as do others in my group), and if the switch to R is to be made I
would like to make that switch as complete as possible.   This would also
avoid requiring new hires to know both languages.  Third, if SAS is to be
kept around, it defeats one of the main advantages of having open source
code in the first place (R is wonderfully free!).  Like Mayo, Baylor Health
(my previous employer) used both Splus and SAS.  I was warned that data
manipulation would be much more difficult in R/Splus than it was in SAS.
To be honest, and I say this humbly realizing that most posters to this
list have much more experience than I, I haven't found data manipulation to
be that much more difficult in R/Splus (at least as I have gained
experience in R/Splus).   I can think of two exceptions (1) large datasets
and (2) SAS seems to play nicer with MS products (e.g. PROC IMPORT seemed
to read in messy Excel spreadsheets better than importData in Splus).  Is
it possible (and I again say this with MUCH humility) that the perceived
advantages of SAS with regards to data manipulation may be due in part to
some users only using R/Splus for stat modeling and graphics (thus never
becoming familiar with the data manipulation capabilities of R/Splus) or to
the reluctance of SAS-trained individuals and companies to make the
complete switch?

Tony, the story about the famous software and the certain operating
system at the large company was priceless.

In closing, I should mention that in all posts I am speaking for myself and
not for Edwards LifeSciences.

Regards,
-Cody



Re: [R] R and clinical studies

2007-03-20 Thread Frank E Harrell Jr
[EMAIL PROTECTED] wrote:
 Thank you to all those that responded to Delphine's original post on R and
 clinical studies.  They have provided much food for thought.
 
 I had a couple of follow up questions/comments.  Andrew is very correct in
 pointing out that there are classes and workshops available for R.  It's my
 understanding that there are even commercial versions of R that now provide
 formal commercial-style courses.  And at any rate, the money saved by
 potentially avoiding pricey software could certainly justify any training
 expense in time or money  - this assumes of course that the pricey software
 could be dispensed with (I suspect that would take considerable time at my
 current company as so many legacy projects have been done in proprietary
 software).  I still think that R provides less 'hand-holding' and requires
 more initiative (which may be more or less present on a per
 programmer/statistician basis).
 
 I guess one could always integrate R/Splus in with SAS, as Terry's group
 has done at Mayo - I will probably do this at least as a start.  I have a
 few concerns with regards to this approach (these may be needless concerns,
 but I will venture expressing them anyway).  First, I'm worried about the
 possibility of compatibility problems (will anyone be worried about a SAS
 dataset read into R or vice versa?).  Second, I would prefer focusing all
 my learning on one package if possible.  I actually have more experience
 with SAS (as do others in my group), and if the switch to R is to be made I
 would like to make that switch as complete as possible.   This would also
 avoid requiring new hires to know both languages.  Third, if SAS is to be
 kept around, it defeats one of the main advantages of having open source
 code in the first place (R is wonderfully free!).  Like Mayo, Baylor Health
 (my previous employer) used both Splus and SAS.  I was warned that data
 manipulation would be much more difficult in R/Splus than it was in SAS.
 To be honest, and I say this humbly realizing that most posters to this
 list have much more experience than I, I haven't found data manipulation to
 be that much more difficult in R/Splus (at least as I have gained
 experience in R/Splus).   I can think of two exceptions (1) large datasets
 and (2) SAS seems to play nicer with MS products (e.g. PROC IMPORT seemed
 to read in messy Excel spreadsheets better than importData in Splus).  Is
 it possible (and I again say this with MUCH humility) that the perceived
 advantages of SAS with regards to data manipulation may be due in part to
 some users only using R/Splus for stat modeling and graphics (thus never
 becoming familiar with the data manipulation capabilities of R/Splus) or to
 the reluctance of SAS-trained individuals and companies to make the
 complete switch?

You are exactly correct on this point.  Some graduate programs only 
teach students how to use R/S-Plus for modeling and graphics.  R/S-Plus 
are wonderful for data manipulation - more powerful than SAS but not 
easy to learn (plus in R there are sometimes too many ways to do 
something; new users get lost - e.g. the reshape and reShape functions 
and the reshape package). 
http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf has many 
examples of complex data manipulation as do some web sites.  We do 
analysis for pharmaceutical companies with 100% of the data manipulation 
done in R after importing say 50 SAS datasets into R.  Doing tasks such 
as finding a lab value measured the closest in time to some event is 
much more elegant in R/S-Plus than in SAS.
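
As a rough illustration of that last task, here is a minimal sketch in base R. It is not taken from the sintro.pdf document; the data frames `events` and `labs`, their column names, and all values are hypothetical, and real data would need explicit rules for ties and missing dates.

```r
# Hypothetical example data: one event date per patient, several lab draws.
events <- data.frame(
  id         = c(1, 2),
  event_date = as.Date(c("2007-01-10", "2007-02-01"))
)
labs <- data.frame(
  id       = c(1, 1, 1, 2, 2),
  lab_date = as.Date(c("2007-01-02", "2007-01-09", "2007-01-20",
                       "2007-01-15", "2007-02-03")),
  value    = c(5.1, 6.3, 4.8, 7.0, 7.4)
)

# Attach each patient's event date to every one of that patient's lab records.
m <- merge(labs, events, by = "id")

# Absolute gap in days between each lab draw and the event.
m$gap <- abs(as.numeric(m$lab_date - m$event_date))

# Within each patient, keep the row with the smallest gap
# (which.min() returns the first such row if there is a tie).
closest <- do.call(rbind, lapply(split(m, m$id),
                                 function(d) d[which.min(d$gap), ]))
rownames(closest) <- NULL

closest[, c("id", "lab_date", "value", "gap")]
```

The merge-then-split pattern keeps every step inspectable, which helps when the derivation itself has to be checked.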

Frank

 
 Tony, the story about the famous software and the certain operating
 system at the large company was priceless.
 
 In closing, I should mention that in all posts I am speaking for myself and
 not for Edwards LifeSciences.
 
 Regards,
 -Cody
 
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] R and clinical studies

2007-03-19 Thread Terry Therneau
  A strength of R is that there is a wide variety of contributions to the
package, giving it great breadth.
  A weakness of R is that there is a wide variety of contributors to the
package, some of whom spend a lot of time on the task of function correctness,
and some of whom spend little; some worry about backward compatibility, some
sneer at the idea; some spend a lot of time on maintenance, and some don't
have the time to do so or move on to other things.

   The survival code, for instance, has a set of exact test cases.  These are
small data sets where the correct answer has been carefully worked out by
hand.  S (Splus or R) passes all the tests, SAS passes most of them.  (Most of
the tests are documented in an appendix of Therneau and Grambsch, Springer,
2000).  These test cases have been a great help in creating and debugging the
code, but overall represent a large amount of work.  Most code that does not
have a corporate sponsor will not have the resources to do this.  I have them
mostly because the survival library's genesis has been spread out over 20
years, and individual bits were important parts of clinical trials and so
HAD to be right.
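
To show the flavor of such an exact test case, here is a minimal sketch. It is not one of the tests from the book or from the survival library's own test suite; it is a made-up three-observation data set with a Kaplan-Meier estimate (rather than a Cox model) so that the hand calculation fits in a few comment lines. It assumes the survival package is installed, as it is in standard R distributions.

```r
library(survival)  # recommended package; assumed to be installed

# Tiny made-up data set: events at t = 1 and t = 3, one censored case at t = 2.
time   <- c(1, 2, 3)
status <- c(1, 0, 1)   # 1 = event, 0 = censored

fit <- survfit(Surv(time, status) ~ 1)

# Kaplan-Meier estimate worked out by hand:
#   t = 1: 3 at risk, 1 event   -> S = 2/3
#   t = 2: 2 at risk, 0 events  -> S = 2/3 (censoring only)
#   t = 3: 1 at risk, 1 event   -> S = 2/3 * 0 = 0
expected <- c(2/3, 2/3, 0)

# Survival estimates from the fitted curve at the same three times.
computed <- summary(fit, times = time)$surv

# The test passes silently, or stops with an error if the answers differ.
stopifnot(isTRUE(all.equal(computed, expected)))
```

A real test suite would hold many such cases, including ties and edge cases, which is where the work mentioned above goes.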

  (Aside.  SAS has a deserved reputation for accuracy.  It has an undeserved
one for infallibility --- one of my favorite bug reports for the S code
started out "I've found a mistake in the coxph function, it gives a different
answer than SAS".  It turned out in that case that the S and SAS data sets
in their example were not quite the same.  As an earlier poster said, data
management and manipulation is the root of most errors.)

   Our group uses SAS for data manipulation primarily, and a mix of SAS and
S-Plus for the analysis.  It would be difficult to become a pure S shop, but
we've had no trouble with the mix.

Terry Therneau
Biostatistics, Mayo Clinic



Re: [R] R and clinical studies

2007-03-17 Thread AJ Rossini
On Friday 16 March 2007 09:36, Delphine Fontaine wrote:
 Thanks for your answer, which was very helpful. I have another question:

 I have read in this document
 (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the
 programs written in R are ephemeral and that new releases are not
 always compatible with previous releases. What I would like to know is
 if R functions are already validated and, if not, what should we do to
 validate an R function?

Validation is in the eye of the beholder. 

In particular, for clinical studies, from the corporate or institutional point
of view, "what we should do to validate an R function" should be answered by
the local Standard Operating Procedures (SOPs) for "what should we do to
validate a computer programming language function".

If you are working with clinical trials as part of a health authority 
submission process, you should have those in place.  

Of course, what you probably are interested in is an approach where you
qualify R, and validate programs and packages written for R, which might be
a better approach, in which case the same applies.  Your SOPs should
apply to both.

(Now, assuming that you've done a reasonable job on the processes, as per
Mat's answer, the point is that "R vs. anything else" is a simple red
herring, as there is nothing in the spirit of the regulations which
differentiates any of the characteristics of R from those of any other
reasonable piece of software, for appropriate definitions of "reasonableness".)

<digression title="semi-relevant, on SOPs and commercial software">
I should point out that a certain large company I'm familiar with, which uses a
certain famous piece of statistical software for activities perhaps
described above, can't use the most recent version because of interesting
issues with its self-qualification tool, which prevents it from
self-qualifying the new version on any installation on a certain operating
system originating near where I used to live, when the previous version of
the famous software had been installed.  This feature, if not reverted, would
necessitate a total disk wipe of ALL computers requiring qualification running
this operating system, where the new version of this famous piece of software
would be installed, if this certain large company wants to follow its SOPs.
This is apparently a feature, not a bug, and demonstrates clearly the
benefits and joys of commercial support when millions of Swiss francs of
licensing fees are involved.
</digression>

I'm not a lawyer, nor am I speaking for any corporation indirectly referenced 
above, nor will I provide sufficient justification to help anyone else take 
any of the statements as a fact.

best,
-tony

[EMAIL PROTECTED]
Muttenz, Switzerland.
"Commit early, commit often, and commit in a repository from which we can
easily roll-back your mistakes" (AJR, 4Jan05).




Re: [R] R and clinical studies

2007-03-16 Thread Delphine Fontaine
Thanks for your answer, which was very helpful. I have another question:

I have read in this document
(http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the
programs written in R are ephemeral and that new releases are not
always compatible with previous releases. What I would like to know is
if R functions are already validated and, if not, what should we do to
validate an R function?

-- 
Delphine Fontaine


Quoting Soukup, Mat [EMAIL PROTECTED]:

 Delphine,

 Please see the following message posted a week ago:
 http://comments.gmane.org/gmane.comp.lang.r.general/80175.

 HTH,

 -Mat

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Delphine Fontaine
 Sent: Friday, March 09, 2007 8:29 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] R and clinical studies

 Does anyone know if for clinical studies the FDA would accept
 statistical analyses performed with R?

 Delphine Fontaine



Re: [R] R and clinical studies

2007-03-16 Thread Frank E Harrell Jr
Delphine Fontaine wrote:
 Thanks for your answer, which was very helpful. I have another question:
 
 I have read in this document
 (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the
 programs written in R are ephemeral and that new releases are not
 always compatible with previous releases. What I would like to know is
 if R functions are already validated and, if not, what should we do to
 validate an R function?
 

In the sense in which most persons use the term 'validate', it means to 
show with one or more datasets that the function is capable of producing 
the right answer.  It doesn't mean that it produces the right answer for 
every dataset although we hope it does.  [As an aside, most errors are 
in the data manipulation phase, not in the analysis phase.]  So I think 
that instead of validating functions we should spend more effort on 
validating analyses [and validating analysis file derivation].  Pivotal 
analyses can be re-done a variety of ways, in R or in separate 
programmable packages such as Stata.
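
As a small illustration of re-doing an analysis a second way, here is a minimal sketch in R. The data are made up, and a simple linear regression stands in for a real pivotal analysis; the point is only the pattern of computing the same quantity by two independent routes and comparing.

```r
# Made-up data for illustration only.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Route 1: the packaged fitting function.
fit <- lm(y ~ x)

# Route 2: the textbook closed-form least-squares estimates.
slope     <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
intercept <- mean(y) - slope * mean(x)

# The two routes should agree to numerical tolerance;
# stopifnot() stops with an error if they do not.
stopifnot(isTRUE(all.equal(unname(coef(fit)), c(intercept, slope))))
```

Agreement between the two routes shows only that they match on this dataset, which is the limited sense of "validate" described above.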

-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] R and clinical studies

2007-03-16 Thread Cody_Hamilton

I agree that most problems arise in the data management / file derivation
phase.  From my reading of 21 CFR 11, it appears that this document focuses
primarily on data management (as well as on software directly involved in a
medical device) rather than on validation of statistical functions.  I
believe this point has been made previously on the R-help list.

With regard to validating functions, I have often wondered how one can
validate a function when one cannot see what it is doing.  You could
certainly compare calculations from one package to the same calculations
from another package, but then you must purchase (ouch!) and know how to
properly use two software packages instead of one.  And I suppose they
could both be wrong!  Is not peer review the best form of validation?
... I suspect I may be preaching to the choir here.

I would love nothing more than to migrate our stat group over to R from
SAS.  Based on my experience with R/Splus, the language seems more
extensible and flexible, and has much better graphics (as has been pointed out
many times on this list).  It also has available the many contributions of
generous R users.  However, it has been hard to win pure SAS users onto R
(even if it saves the company money!).  One can't send the biostat group
off to R training like one would to SAS classes.  Learning R requires
initiative from the user (which is not necessarily a bad thing).  I
considered encouraging the purchase of Splus as an intermediate step
(hoping that its proprietary nature would soothe fears regarding open
source software), but that option was not as cheap as I thought.

Regards,
-Cody




Delphine Fontaine wrote:
 Thanks for your answer, which was very helpful. I have another question:

 I have read in this document
 (http://cran.r-project.org/doc/manuals/R-intro.pdf) that most of the
 programs written in R are ephemeral and that new releases are not
 always compatible with previous releases. What I would like to know is
 if R functions are already validated and, if not, what should we do to
 validate an R function?


In the sense in which most persons use the term 'validate', it means to
show with one or more datasets that the function is capable of producing
the right answer.  It doesn't mean that it produces the right answer for
every dataset although we hope it does.  [As an aside, most errors are
in the data manipulation phase, not in the analysis phase.]  So I think
that instead of validating functions we should spend more effort on
validating analyses [and validating analysis file derivation].  Pivotal
analyses can be re-done a variety of ways, in R or in separate
programmable packages such as Stata.

--
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University



Re: [R] R and clinical studies

2007-03-09 Thread Soukup, Mat
Delphine,

Please see the following message posted a week ago:
http://comments.gmane.org/gmane.comp.lang.r.general/80175.

HTH,

-Mat 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Delphine Fontaine
Sent: Friday, March 09, 2007 8:29 AM
To: r-help@stat.math.ethz.ch
Subject: [R] R and clinical studies

Does anyone know if for clinical studies the FDA would accept
statistical analyses performed with R?

Delphine Fontaine
