Re: [R] Inefficiency of SAS Programming
Dear Ajay,

just to deny the implicit statement 'corporate user' = 'moron' surfacing here and there in this interesting thread :^). This might be a statistical regularity but should by no means be considered a theorem, as there are counter-examples available. Even in corporations and burosaurs of any kind you can find people willing to learn both languages, appreciate the difference between them and use each where it is particularly strong.

IMVHO, acceptance of R in the corporate world has little to do with syntax and much to do with legacies, (discharge of) responsibilities and the distance between the decision maker/buyer and those who actually work with the software. Otherwise, assuming that 'corporate users' are not at a significant cerebral disadvantage (which I like to assume), the penetration of R in education and in small and large companies should be the same, which I'm afraid it is not. So I believe it boils down to industrial organization and the open source vs. commercial development model, rather than to some kind of (more or less appropriate) function rebranding.

It is the *difference* in syntax w.r.t. SAS that prompted the shift to R, in my case at least. It was its ease and 'cleanliness' of installation (no registry entries, no access to forbidden directories required) which allowed me to experiment with it without having to mess with the IT Dept. (which would probably have put an end to my quest). It was its open source nature that allowed me to install it anywhere I liked.

My 2 Euro-Cents
Giovanni

Disclaimer: just thinking of the Proc Step gives me shivers; yet I recognize SAS is fast and powerful. I could understand somebody wanting to execute SAS through R syntax, but the opposite is beyond my grasp.

--
Message: 72
Date: Wed, 4 Mar 2009 08:44:51 +1300
From: Rolf Turner r.tur...@auckland.ac.nz
Subject: Re: [R] Inefficiency of SAS Programming
To: Ajay ohri ohri2...@gmail.com
Cc: r-help-boun...@r-project.org, Gerard M. Keogh gmke...@justice.ie, R list r-h...@stat.math.ethz.ch, Greg Snow greg.s...@imail.org
Message-ID: 8993cba0-46a3-41de-abbb-29db205fb...@auckland.ac.nz
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed

On 3/03/2009, at 5:58 PM, Ajay ohri wrote:

> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne. It basically is meant only for SAS users who want to learn R, without upsetting the schedule of the corporate users. Simply put, it is a wrapper on SAS language using the function command, i.e. the procunivariate function in the Anne package would call the summary function, and so on...

Reminds me of fortune(38).

cheers,
Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inefficiency of SAS Programming
Ajay,

I think this is somewhat unnecessary! For regular SAS users PROC SUMMARY/MEANS is the data manipulation staple. A SAS user interested in R will look at Intro to R and find it recommends tapply for data manipulation jobs. This is a problem - tapply just isn't simple and obvious. A better option would be for the Intro to R to link to Bob Muenchen's original 80-page document R for SAS and SPSS users, page 52, where the summarize function is described.

Gerard

From: Ajay ohri ohri2...@gmail.com
To: Greg Snow greg.s...@imail.org
cc: Gerard M. Keogh gmke...@justice.ie, r-help-boun...@r-project.org, R list r-h...@stat.math.ethz.ch
Date: 03/03/2009 04:58
Subject: Re: [R] Inefficiency of SAS Programming

> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne. It basically is meant only for SAS users who want to learn R, without upsetting the schedule of the corporate users. Simply put, it is a wrapper on SAS language using the function command, i.e. the procunivariate function in the Anne package would call the summary function, and so on...
>
> Regards,
> Ajay
> www.decisionstats.com
>
> On Tue, Mar 3, 2009 at 9:20 AM, Greg Snow greg.s...@imail.org wrote:
>> This does not really address my point. Yes, if the few nerds who want to do funny stuff are the ones making the purchase, then there is a good chance (but still not guaranteed) that they will get IML, but do all companies that buy SAS actually think about that, or do they just see the extra price (no matter how low), or not even think to look at that piece because the person making the purchase does not really know about the funny things you can do with it?
>>
>> If you want your SAS code to be able to be run by anyone with SAS, you cannot assume that they have IML. If you want your R code to be run by anyone, you cannot make your code dependent on packages/tools that are not available for all platforms.
>>
>> --
>> Gregory (Greg) L. Snow Ph.D.
>> Statistical Data Center
>> Intermountain Healthcare
>> greg.s...@imail.org
>> 801.408.8111
>>
>>> -----Original Message-----
>>> From: Gerard M. Keogh [mailto:gmke...@justice.ie]
>>> Sent: Monday, March 02, 2009 3:22 AM
>>> To: Greg Snow
>>> Cc: Frank E Harrell Jr; R list; r-help-boun...@r-project.org
>>> Subject: Re: [R] Inefficiency of SAS Programming
>>>
>>> Yes Greg, but if you're buying SAS they'll throw in IML pretty cheaply - SAS think it's only for a few nerds out there who want to do funny stuff.
>>>
>>> G
>>>
>>> From: Greg Snow greg.s...@imail.org
>>> Sent by: r-help-boun...@r-project.org
>>> To: Gerard M. Keogh gmke...@justice.ie, Frank E Harrell Jr f.harr...@vanderbilt.edu
>>> cc: r-help-boun...@r-project.org, R list r-h...@stat.math.ethz.ch
>>> Date: 27/02/2009 19:05
>>> Subject: Re: [R] Inefficiency of SAS Programming
>>>
>>>> But SAS/IML is not part of base SAS, it costs extra, so there is a good chance that a user that has SAS will not be able to run code that uses SAS/IML. I have known of SAS programmers who know IML well that still write matrix/vector tools using macros or proc transpose so that a user without IML can still use the code (the fact that the code
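A small R sketch of the point about tapply may help here. The data frame `sales` and its columns are invented for illustration, and base R's aggregate() is used as a stand-in for the summarize function mentioned above (which is not shown in this thread), so treat it as an assumption rather than as the recommended route:

```r
# Toy data standing in for a SAS data set (made up for illustration).
sales <- data.frame(
  region = c("east", "east", "west", "west", "west"),
  amount = c(10, 20, 5, 15, 40)
)

# tapply: one value per group, returned as a named vector (1-d array).
mean_by_region <- tapply(sales$amount, sales$region, mean)

# aggregate: the same group means, returned as a data frame.
stats_by_region <- aggregate(amount ~ region, data = sales, FUN = mean)

print(mean_by_region)
print(stats_by_region)
```

tapply gives back a named vector, while aggregate returns a data frame, which is probably closer to what a PROC MEANS user expects as an output data set.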
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote:
> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne.

If you want to market this, Ajay, I'd suggest a name like SASsieR.

Jim
Re: [R] Inefficiency of SAS Programming
Ok. Basically everything that SAS can do, R can do, but vice versa isn't true. Using the Anne package just renames the functions into standardized data and proc steps for user comfort. Once a SAS user finds that R is productive and useful, and even more powerful for even less money, he can unload the Anne package and move straight on to R itself (the 'Intro to R', for instance). It is also a good personal exercise for me to learn how to create R packages.

Jai ho !!!

Ajay

You can read more on this concept idea here (note it is an idea, not a package, for now, and so I have posted on it):
http://www.decisionstats.com/2009/03/an-r-package-only-for-sas-users/

On Tue, Mar 3, 2009 at 10:28 AM, Ajay ohri ohri2...@gmail.com wrote:
> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne. It basically is meant only for SAS users who want to learn R, without upsetting the schedule of the corporate users. Simply put, it is a wrapper on SAS language using the function command, i.e. the procunivariate function in the Anne package would call the summary function, and so on...
>
> Regards,
> Ajay
> www.decisionstats.com
Re: [R] Inefficiency of SAS Programming
No market for R packages exists in the true economic sense, as there is demand and supply and utility but no price.

Ajay

Did Tom Sawyer create the first collaborative project ever (to paint the fence)?

On Tue, Mar 3, 2009 at 4:24 PM, Jim Lemon j...@bitwrit.com.au wrote:
> Ajay ohri wrote:
>> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne.
>
> If you want to market this, Ajay, I'd suggest a name like SASsieR.
>
> Jim
Re: [R] Inefficiency of SAS Programming
2009/3/3 Jim Lemon j...@bitwrit.com.au:
> Ajay ohri wrote:
>> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne.
>
> If you want to market this, Ajay, I'd suggest a name like SASsieR.

Nice name. I'd call it 'Ursas', which works on several levels:

1. U-R-SAS. (pun on You Are SAS, or You R-SAS)
2. Ur-SAS. (Germanic prefix Ur- meaning 'proto' or 'primitive')[1]
3. Ursas (from Latin 'Ursa' meaning 'bear', because it seems very few R users can bear SAS)

Barry

[1] http://en.wiktionary.org/wiki/ur-
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote:
> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne. It basically is meant only for SAS users who want to learn R, without upsetting the schedule of the corporate users. Simply put, it is a wrapper on SAS language using the function command, i.e. the procunivariate function in the Anne package would call the summary function, and so on...

Go ahead and add to the confusion. You've already created some by using summary for procunivariate. I created the describe function in the Hmisc package to replace univariate.

Frank
Re: [R] Inefficiency of SAS Programming
On Mar 3, 9:58 am, Ajay ohri ohri2...@gmail.com wrote:
> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne. It basically is meant only for SAS users who want to learn R, without upsetting the schedule of the corporate users. Simply put, it is a wrapper on SAS language using the function command, i.e. the procunivariate function in the Anne package would call the summary function, and so on...
>
> Regards,
> Ajay
> www.decisionstats.com

Bob Muenchen's book R for SAS and SPSS users provides a systematic transition plan (if I may use that term) for SAS and SPSS users intending to work on or migrate to R. Being a newly converted R user myself, I'm inclined to believe that creating yet another package that just houses some SAS-procedure-sounding names for data manipulation/summarization would add a fair bit of confusion.

my $.02..

-Girish
Re: [R] Inefficiency of SAS Programming
Why didn't you call it procunivariate, if that was exactly what you wanted to do?

On Tue, Mar 3, 2009 at 7:11 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:
> Ajay ohri wrote:
>> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne. It basically is meant only for SAS users who want to learn R, without upsetting the schedule of the corporate users. Simply put, it is a wrapper on SAS language using the function command, i.e. the procunivariate function in the Anne package would call the summary function, and so on...
>
> Go ahead and add to the confusion. You've already created some by using summary for procunivariate. I created the describe function in the Hmisc package to replace univariate.
>
> Frank
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote:
> Why didn't you call it procunivariate, if that was exactly what you wanted to do?

Why would I ever do that? describe is an improvement on univariate, saving an estimated 87 +/- 87 trees per year in paper by printing what you need in much less space and concentrating on real descriptive statistics.

Frank
Re: [R] Inefficiency of SAS Programming
On 3/03/2009, at 5:58 PM, Ajay ohri wrote:
> For an inefficient language, it sure has dominated the predictive analytics world for 3 plus decades. I referred once to intellectual jealousy between Newton and Leibniz. I am going ahead and creating the R package called Anne. It basically is meant only for SAS users who want to learn R, without upsetting the schedule of the corporate users. Simply put, it is a wrapper on SAS language using the function command, i.e. the procunivariate function in the Anne package would call the summary function, and so on...

Reminds me of fortune(38).

cheers,
Rolf Turner
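For what it is worth, the wrapper idea quoted above could be sketched in a few lines of R. This is only a guess at what such a function might look like: nothing of the Anne package exists in this thread beyond the idea, and the argument names `data` and `var` as well as the toy data frame are hypothetical.

```r
# Hypothetical wrapper in the spirit of the proposed package: a SAS-sounding
# name that simply delegates to base R's summary(), as the post describes.
procunivariate <- function(data, var) {
  stopifnot(is.data.frame(data), var %in% names(data))
  summary(data[[var]])
}

# Made-up data for illustration.
patients <- data.frame(age = c(1, 2, 3, 4, 100))

res <- procunivariate(patients, "age")
print(res)
```

Elsewhere in the thread, describe() from the Hmisc package is suggested as the closer analogue of PROC UNIVARIATE; it is left out here only to keep the sketch free of add-on packages.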
Re: [R] Inefficiency of SAS Programming
Yes Greg, but if you're buying SAS they'll throw in IML pretty cheaply - SAS think it's only for a few nerds out there who want to do funny stuff.

G

From: Greg Snow greg.s...@imail.org
Sent by: r-help-boun...@r-project.org
To: Gerard M. Keogh gmke...@justice.ie, Frank E Harrell Jr f.harr...@vanderbilt.edu
cc: r-help-boun...@r-project.org, R list r-h...@stat.math.ethz.ch
Date: 27/02/2009 19:05
Subject: Re: [R] Inefficiency of SAS Programming

> But SAS/IML is not part of base SAS, it costs extra, so there is a good chance that a user that has SAS will not be able to run code that uses SAS/IML. I have known of SAS programmers who know IML well that still write matrix/vector tools using macros or proc transpose so that a user without IML can still use the code (the fact that the code that started this thread was found on a website suggests that it was meant for general use rather than something only used internally where you know what add-ons will be available).
>
> Just another way that R makes life easier for both programmer and user.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gerard M. Keogh
>> Sent: Friday, February 27, 2009 7:19 AM
>> To: Frank E Harrell Jr
>> Cc: r-help-boun...@r-project.org; R list
>> Subject: Re: [R] Inefficiency of SAS Programming
>>
>> Yes Frank, I accept your point but nevertheless IML is the proper place for matrix work in SAS - mixing macro-level logic and computation is another question - R is certainly more seamless in this respect.
>>
>> Gerard
>>
>> From: Frank E Harrell Jr f.harr...@vanderbilt.edu
>> To: Gerard M. Keogh gmke...@justice.ie
>> cc: R list r-h...@stat.math.ethz.ch, r-help-boun...@r-project.org
>> Date: 27/02/2009 13:55
>> Subject: Re: [R] Inefficiency of SAS Programming
>>
>>> Gerard M. Keogh wrote:
>>>> Frank,
>>>>
>>>> I can't see the code you mention - Web marshall at work - but I don't think you should be too quick to run down SAS - it's a powerful and flexible language but unfortunately very expensive.
>>>>
>>>> Your example mentions doing a vector product in the macro language - this only suggests to me that those people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices - macros are only an efficient code copying device. Doing matrix computations in this way is pretty mad and the code would be impossible, never mind the memory problems. SAS recognise that, but a lot of SAS users remain unfamiliar with IML.
>>>>
>>>> In IML by contrast there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS for example is one line:
>>>>
>>>>     b = solve(X`X, X`y) ;
>>>>     rss = sqrt(ssq(y - X*b)) ;
>>>>
>>>> And to give you a flavour of IML's capabilities I implemented a SAS version of the MARS program in it about 6 or 7 years ago. BTW SPSS also has a matrix language.
>>>>
>>>> Gerard
>>>
>>> But try this:
>>>
>>>     PROC IML;
>>>     ... some custom user code ...
>>>     ... loop over j=1 to 10 ...
>>>     ... PROC GENMOD, output results back to IML ...
>>>
>>> IML is only a partial solution since it is not integrated with the PROC step.
>>>
>>> Frank
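For comparison, the same normal-equations fit can be sketched in base R. The simulated data and all variable names below are assumptions made for illustration only; crossprod(), solve() and lm() are standard functions in base R and the stats package.

```r
# Simulated data (made up): an intercept column plus two predictors.
set.seed(1)
n <- 50
X <- cbind(1, rnorm(n), rnorm(n))
y <- as.vector(X %*% c(2, -1, 0.5) + rnorm(n, sd = 0.1))

# Normal equations: solve (X'X) b = X'y without forming an explicit inverse.
b <- solve(crossprod(X), crossprod(X, y))

# Residual sum of squares, and its square root as in the IML line quoted above.
resid <- y - X %*% b
rss <- sum(resid^2)
root_rss <- sqrt(rss)

# Cross-check against lm(); "- 1" drops lm's own intercept because X has one.
fit <- lm(y ~ X - 1)
print(cbind(normal_eq = as.vector(b), lm = unname(coef(fit))))
```

In practice lm() (which uses a QR decomposition) is usually the safer choice numerically; the explicit solve() is shown only because it mirrors the IML statement.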
Re: [R] Inefficiency of SAS Programming
R depends on all of those things to run, but you only have to use those programs through R. The software depends on these other tools, but the human doesn't have to switch interfaces.

Tom!

On Fri, Feb 27, 2009 at 9:22 PM, Gabor Grothendieck ggrothendi...@gmail.com wrote:
> On Fri, Feb 27, 2009 at 8:53 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:
>> Ajay ohri wrote:
>>> Sometimes for the sake of simplicity, SAS coding is created like that. One can use the concatenate function and drag and drop in a simple Excel sheet for creating elaborate SAS code like the one mentioned, in no time at all.
>>
>> A system that requires Excel for its success is not a complete system.
>
> To be fair R depends on perl (although this dependence seems to be decreasing lately and possibly will be eliminated), latex and a bunch of unix tools. Developing GUIs depends on tcl/tk or other external system and developing fast code can require that some of it be written in C.
Re: [R] Inefficiency of SAS Programming
On 3/2/2009 6:57 AM, Thomas Levine wrote:
> R depends on all of those things to run, but you only have to use those programs through R. The software depends on these other tools, but the human doesn't have to switch interfaces.

In fact, it doesn't even depend on them to run. Most Windows users don't have perl, latex, any of the unix tools, an external tcl/tk system (R includes one), or a C compiler. (Most *nix users have them, but don't need them for running R, with the exception of tcl/tk, which is used by a number of packages.) You only need them to build packages, or to build R.

Duncan Murdoch
Re: [R] Inefficiency of SAS Programming
If you want to write Sweave reports you have to learn latex, and R does not hide that from you. The situation is somewhat better for tcltk, especially if you use one of the higher-level wrapper packages built on it, but for serious work directly with it you need tcl/tk materials.

On Mon, Mar 2, 2009 at 6:57 AM, Thomas Levine thomas.lev...@gmail.com wrote:
> R depends on all of those things to run, but you only have to use those programs through R. The software depends on these other tools, but the human doesn't have to switch interfaces.
>
> Tom!
Re: [R] Inefficiency of SAS Programming
This does not really address my point. Yes, if the few nerds who want to do funny stuff are the ones making the purchase, then there is a good chance (but still no guarantee) that they will get IML. But do all companies that buy SAS actually think about that? Or do they just see the extra price (no matter how low), or not even think to look at that piece because the person making the purchase does not really understand the funny things you can do with it?

If you want your SAS code to be runnable by anyone with SAS, you cannot assume that they have IML. If you want your R code to be runnable by anyone, you cannot make your code dependent on packages/tools that are not available for all platforms.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: Gerard M. Keogh [mailto:gmke...@justice.ie]
Sent: Monday, March 02, 2009 3:22 AM
To: Greg Snow
Cc: Frank E Harrell Jr; R list; r-help-boun...@r-project.org
Subject: Re: [R] Inefficiency of SAS Programming

> Yes Greg, but if you're buying SAS they'll throw in IML pretty cheaply - SAS think it's only for a few nerds out there who want to do funny stuff.
>
> G

> Greg Snow greg.s...@imail.org wrote on 27/02/2009 19:05:
>> But SAS/IML is not part of base SAS; it costs extra, so there is a good chance that a user who has SAS will not be able to run code that uses SAS/IML. I have known SAS programmers who know IML well but still write matrix/vector tools using macros or proc transpose, so that a user without IML can still use the code. (The fact that the code that started this thread was found on a website suggests that it was meant for general use rather than something only used internally, where you know what add-ons will be available.) Just another way that R makes life easier for both programmer and user.
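The portability point above (do not assume an add-on is installed) can be sketched in R roughly as follows. This is only an illustration under stated assumptions: MASS stands in for "an optional component that may or may not be present", and the helper name safe_inverse is made up for this example.

```r
# Hypothetical helper: use an optional package when it is installed,
# otherwise fall back to a base-R routine so the code still runs.
safe_inverse <- function(A) {
  if (requireNamespace("MASS", quietly = TRUE)) {
    MASS::ginv(A)  # generalized inverse from the optional package
  } else {
    solve(A)       # base-R fallback; needs a square, non-singular matrix
  }
}

A <- matrix(c(2, 0, 0, 4), nrow = 2)
A_inv <- safe_inverse(A)
print(A_inv %*% A)  # numerically the 2 x 2 identity on either branch
```

The check happens at call time, so a user without the add-on gets the fallback rather than an error at load time.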
Re: [R] Inefficiency of SAS Programming
For an inefficient language, it sure has dominated the predictive analytics world for three-plus decades. I referred once to the intellectual jealousy between Newton and Leibniz.

I am going ahead and creating an R package called Anne. It is meant only for SAS users who want to learn R without upsetting the schedule of the corporate users. Simply put, it is a wrapper on the SAS language using the function command, i.e. the procunivariate function in the Anne package would call the summary function, and so on.

Regards,
Ajay
www.decisionstats.com

On Tue, Mar 3, 2009 at 9:20 AM, Greg Snow greg.s...@imail.org wrote:
> This does not really address my point. Yes, if the few nerds who want to do funny stuff are the ones making the purchase, then there is a good chance (but still no guarantee) that they will get IML. But do all companies that buy SAS actually think about that? Or do they just see the extra price (no matter how low), or not even think to look at that piece because the person making the purchase does not really understand the funny things you can do with it?
>
> If you want your SAS code to be runnable by anyone with SAS, you cannot assume that they have IML. If you want your R code to be runnable by anyone, you cannot make your code dependent on packages/tools that are not available for all platforms.
>
> -- Gregory (Greg) L. Snow Ph.D., Statistical Data Center, Intermountain Healthcare, greg.s...@imail.org, 801.408.8111
Re: [R] Inefficiency of SAS Programming
Frank,

I can't see the code you mention (Web marshall at work), but I don't think you should be too quick to run down SAS: it's a powerful and flexible language, but unfortunately very expensive. Your example mentions doing a vector product in the macro language. This only suggests to me that the people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices; macros are only an efficient code-copying device. Doing matrix computations in this way is pretty mad, and the code would be impossible, never mind the memory problems. SAS recognise that, but a lot of SAS users remain unfamiliar with IML.

In IML, by contrast, there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS, for example, is one line:

  b = solve(X`*X, X`*y); rss = sqrt(ssq(y - X*b));

And to give you a flavour of IML's capabilities, I implemented a SAS version of the MARS program in it about 6 or 7 years ago. BTW SPSS also has a matrix language.

Gerard

From: Frank E Harrell Jr f.harr...@vanderbilt.edu
Sent by: r-help-boun...@r-project.org
Date: 26/02/2009 22:57
To: R list r-h...@stat.math.ethz.ch
Subject: [R] Inefficiency of SAS Programming

> If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.
>
> Frank
> --
> Frank E Harrell Jr, Professor and Chair, School of Medicine, Department of Biostatistics, Vanderbilt University
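For comparison with the IML one-liner quoted in this message, a base-R sketch of the same least-squares computation might look like this. The simulated data and the names X, y, b and resid_norm are assumptions made for illustration; nothing here comes from the AHRQ programs.

```r
set.seed(1)
n <- 50
X <- cbind(1, rnorm(n), rnorm(n))          # design matrix with an intercept column
y <- drop(X %*% c(2, -1, 0.5)) + rnorm(n)  # simulated response

# Solve the normal equations (X'X) b = X'y, as solve(X`*X, X`*y) does in IML.
b <- solve(crossprod(X), crossprod(X, y))

# Square root of the residual sum of squares, mirroring sqrt(ssq(y - X*b)).
resid_norm <- sqrt(sum((y - X %*% b)^2))

print(drop(b))
print(resid_norm)
```

In day-to-day R work lm() would normally be used instead; the explicit matrix form is shown only because it lines up with the IML statement.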
Re: [R] Inefficiency of SAS Programming
I would like to know if we can create a package in which R functions are renamed to be closer to the SAS language. Doing so would help people familiar with SAS to take to R straight away for their work, lowering the threshold for acceptance, and they could get a deeper understanding later. Since it is a package, it would be optional, only for people wanting to try out R from SAS.

Do we have such a package right now? It would basically mask R functions with the equivalent function names from another language, just for the ease of users and beginners. For example:

Creating a function for means:

  procmeans <- function(x, y) {
    summary(subset(x, select = y))
  }

Creating a function for importing csv:

  procimport <- function(x, y) {
    read.csv(textConnection(x), row.names = y, na.strings = "")
  }

Creating a function for describing data:

  procunivariate <- function(x) {
    summary(x)
  }

regards,
ajay
www.decisionstats.com

On Fri, Feb 27, 2009 at 4:27 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:
> If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.
>
> Frank
> --
> Frank E Harrell Jr, Professor and Chair, School of Medicine, Department of Biostatistics, Vanderbilt University
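To see what the proposed wrappers would do in practice, here is a small self-contained sketch that exercises them on made-up data. The argument names (data, vars, lines, id_col) and the toy CSV are assumptions for illustration, and the na.strings value is a guess since it is missing from the message as archived.

```r
# SAS-flavoured wrappers around ordinary R functions (illustrative only).
procmeans <- function(data, vars) {
  summary(data[, vars, drop = FALSE])  # summarise the chosen columns
}

procimport <- function(lines, id_col) {
  con <- textConnection(lines)         # treat a character vector as a file
  on.exit(close(con))
  read.csv(con, row.names = id_col, na.strings = "")
}

procunivariate <- function(x) {
  summary(x)                           # describe a single variable
}

# Toy input: one line per element, with a blank weight for subject "b".
csv_lines <- c("id,height,weight", "a,1.70,65", "b,1.82,", "c,1.65,58")
d <- procimport(csv_lines, "id")
print(procmeans(d, c("height", "weight")))
print(procunivariate(d$height))
```

Each wrapper is a one-line renaming, which is the substance of the proposal: the SAS-style name adds no behaviour beyond the R function it calls.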
Re: [R] Inefficiency of SAS Programming
2009/2/27 Peter Dalgaard p.dalga...@biostat.ku.dk:

> Presumably, something like
>
>   IF N. = 1 THEN SUB_N = 1;
>   ELSE IF N. < 5 THEN SUB_N = N.-1;
>   ELSE IF N. < 16 THEN SUB_N = N.-2;
>   ELSE SUB_N = N.-3;
>
> would work, provided that 2, 5, 16 are impossible values. Problem is that it actually makes the code harder to grasp, so experienced SAS programmers go for the dumb but readable code like the above.

I'm not sure which is easier to grasp. When I first saw the original version I thought it was an odd way of doing SUB_N = N.. Only then did I have a closer look and spot the missing 2, 5, and 16. A comment would have been very enlightening. But there was nothing relevant.

> In R, the cleanest I can think of is
>
>   subn <- match(n, setdiff(1:19, c(2,5,16)))
>
> or maybe just
>
>   subn <- match(n, c(1, 3:4, 6:15, 17:19))
>
> although
>
>   subn <- factor(n, levels = c(1, 3:4, 6:15, 17:19))
>
> might be what is really wanted

I think the important thing with any programming is to make sure what you want is expressed in words somewhere. If not in the code, then in the comments. And operations like this should be abstracted into functions.

All the examples of SAS code I've seen seem to fall into the old practice of writing great long 'scripts', with minimal code reuse and encapsulation of useful functionality. If these SAS scripts are then given to new SAS programmers, the chances are they will follow these bad practices. Show them well-written R code (or C, or Python) and maybe they can bring those good practices into their SAS work. Assuming SAS can do that. I'm not sure.

Barry
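Following the suggestion to abstract the recoding into a commented function, a minimal sketch could look like this. The function name sub_n and the constant valid_n are made up for the example; the recoding itself is the match() idea quoted above.

```r
# Values of n that can occur: 1..19 with 2, 5 and 16 excluded.
valid_n <- setdiff(1:19, c(2, 5, 16))

# Map each n to its position among the valid values (1..16).
# Anything outside the valid set, including 2, 5 and 16, becomes NA.
sub_n <- function(n) {
  match(n, valid_n)
}

print(sub_n(c(1, 3, 4, 6, 15, 17, 19)))
print(sub_n(5))
```

Returning NA for the excluded values makes the "impossible values" assumption visible instead of silently recoding them.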
Re: [R] Inefficiency of SAS Programming
Wensui Liu wrote:
> Thanks for pointing me to the SAS code, Dr Harrell. After reading the code, I have to say that the inefficiency is not related to the SAS language itself but to the SAS programmer. An experienced SAS programmer won't use so much hard-coding, which is very ad hoc and difficult to maintain. I agree with you that in the SAS code it is a little too much to evaluate predictions. Such a complex data step can actually be replaced by simpler IML code.

Agreed that the SAS code could have been much better. I programmed in SAS for 23 years and would have done it much differently. But you will find that the most elegant SAS program rewrite will still be a far cry from the elegance of R.

Frank

> On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:
>> If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.

--
Frank E Harrell Jr, Professor and Chair, School of Medicine, Department of Biostatistics, Vanderbilt University
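The "vector product to evaluate predictions from a logistic regression model" mentioned in the quoted message can be written in a few lines of R. The coefficients and covariate values below are invented purely for illustration and have nothing to do with the AHRQ models.

```r
# Hypothetical fitted coefficients: intercept, age, male indicator.
beta <- c(-2.0, 0.03, 0.8)

# One row per patient: leading 1 for the intercept, then the covariates.
X <- cbind(1, age = c(40, 65, 80), male = c(0, 1, 1))

# Linear predictor as a matrix-vector product, then the inverse logit.
p <- plogis(drop(X %*% beta))
print(p)
```

plogis() is the logistic distribution function in base R, i.e. 1 / (1 + exp(-x)), so no hand-written link function is needed.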
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote:
> Sometimes, for the sake of simplicity, SAS code is created like that. One can use the concatenate function and drag and drop in a simple Excel sheet to create elaborate SAS code like the one mentioned, in no time at all.

A system that requires Excel for its success is not a complete system.

> There are multiple, much better ways to do this in SAS, and similarly in R. There are many areas where SAS programmers would find R not so useful, for example the equivalent of proc logistic for creating a logistic model.

Really? Try this in SAS:

  library(Design)
  f <- lrm(death ~ rcs(age,5)*sex)
  anova(f)     # get test of nonlinearity of interactions among other things
  nomogram(f)  # depict model graphically

The restricted cubic spline in age, i.e., assuming the age relationship is smooth but not much else, is very easy to code in R. There are many other automatic transformations available. The lack of generality of the SAS language makes many SAS users assume linearity far more often than R users do. Also note that PROC LOGISTIC, without invocation of a special option, would make the user believe that older subjects have lower chances of dying, as SAS by default takes the event being predicted to be death=0.

Frank

> On Fri, Feb 27, 2009 at 10:21 AM, Wensui Liu liuwen...@gmail.com wrote:
>> Thanks for pointing me to the SAS code, Dr Harrell. After reading the code, I have to say that the inefficiency is not related to the SAS language itself but to the SAS programmer. An experienced SAS programmer won't use so much hard-coding, which is very ad hoc and difficult to maintain. I agree with you that in the SAS code it is a little too much to evaluate predictions. Such a complex data step can actually be replaced by simpler IML code.
>>
>> --
>> WenSui Liu, Acquisition Risk, Chase
>> Blog: statcompute.spaces.live.com
>> "I can calculate the motion of heavenly bodies, but not the madness of people." -- Isaac Newton

--
Frank E Harrell Jr, Professor and Chair, School of Medicine, Department of Biostatistics, Vanderbilt University
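For readers without the Design package, a rough base-R analogue of the spline-by-sex logistic model can be sketched with glm() and splines::ns(). This swaps in a natural cubic spline basis with default knot placement in place of rcs(), so it is not the same fit as lrm(), and the simulated data are an assumption made only so the code runs.

```r
library(splines)  # ships with R; provides ns() for natural cubic splines

set.seed(2009)
n <- 500
age <- runif(n, 30, 90)
sex <- factor(sample(c("female", "male"), n, replace = TRUE))
death <- rbinom(n, 1, plogis(-6 + 0.07 * age + 0.5 * (sex == "male")))

# Smooth age effect (4 df, i.e. 5 knots including boundaries) interacting with sex.
fit <- glm(death ~ ns(age, df = 4) * sex, family = binomial)

# Likelihood-ratio test of the age-by-sex interaction terms.
reduced <- glm(death ~ ns(age, df = 4) + sex, family = binomial)
print(anova(reduced, fit, test = "Chisq"))
```

With a 0/1 response, glm() models the probability that death equals 1, so the direction of the age effect reads the way one would expect.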
Re: [R] Inefficiency of SAS Programming
Gerard M. Keogh wrote:
> Frank, I can't see the code you mention (Web marshall at work), but I don't think you should be too quick to run down SAS: it's a powerful and flexible language, but unfortunately very expensive. Your example mentions doing a vector product in the macro language. This only suggests to me that the people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices; macros are only an efficient code-copying device. Doing matrix computations in this way is pretty mad, and the code would be impossible, never mind the memory problems. SAS recognise that, but a lot of SAS users remain unfamiliar with IML.
>
> In IML, by contrast, there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS, for example, is one line:
>
>   b = solve(X`*X, X`*y); rss = sqrt(ssq(y - X*b));
>
> And to give you a flavour of IML's capabilities, I implemented a SAS version of the MARS program in it about 6 or 7 years ago. BTW SPSS also has a matrix language.
>
> Gerard

But try this:

  PROC IML;
  ... some custom user code ...
  ... loop over j=1 to 10 ...
  ... PROC GENMOD, output results back to IML ...

IML is only a partial solution since it is not integrated with the PROC step.

Frank

--
Frank E Harrell Jr, Professor and Chair, School of Medicine, Department of Biostatistics, Vanderbilt University
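The loop outlined above (custom code, a loop over j, a model fit inside the loop, results collected back into a matrix) is straightforward in R because model fitting is just another function call. The sketch below uses simulated Poisson data and glm() as a stand-in for PROC GENMOD; the data and variable names are assumptions for illustration.

```r
set.seed(42)
n <- 200
dat <- data.frame(x = rnorm(n))

# One row of coefficients per iteration.
results <- matrix(NA_real_, nrow = 10, ncol = 2,
                  dimnames = list(NULL, c("intercept", "slope")))

for (j in 1:10) {
  # "Custom user code": simulate a response whose intercept depends on j.
  dat$y <- rpois(n, lambda = exp(0.1 * j + 0.5 * dat$x))
  fit <- glm(y ~ x, family = poisson, data = dat)
  results[j, ] <- coef(fit)  # fitted values go straight back into the matrix
}

print(round(results, 2))
```

The fit object is an ordinary R value, so there is no hand-off between a matrix language and a separate procedure step.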
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote:
> I would like to know if we can create a package in which R functions are renamed to be closer to the SAS language. Doing so would help people familiar with SAS to take to R straight away for their work, lowering the threshold for acceptance, and they could get a deeper understanding later. Since it is a package, it would be optional, only for people wanting to try out R from SAS.
>
> Do we have such a package right now? It would basically mask R functions with the equivalent function names from another language, just for the ease of users and beginners. For example:
>
> Creating a function for means:
>
>   procmeans <- function(x, y) {
>     summary(subset(x, select = y))
>   }
>
> Creating a function for importing csv:
>
>   procimport <- function(x, y) {
>     read.csv(textConnection(x), row.names = y, na.strings = "")
>   }
>
> Creating a function for describing data:
>
>   procunivariate <- function(x) {
>     summary(x)
>   }
>
> regards,
> ajay
> www.decisionstats.com http://www.decisionstats.com

Ajay,

This will generate major confusion among users of all types and be hard to maintain. A better approach is to get Bob Muenchen's excellent book and keep it nearby.

Frank

--
Frank E Harrell Jr, Professor and Chair, School of Medicine, Department of Biostatistics, Vanderbilt University
Re: [R] Inefficiency of SAS Programming
Yes Frank, I accept your point, but IML is nevertheless the proper place for matrix work in SAS. Mixing macro-level logic and computation is another question, and R is certainly more seamless in this respect.

Gerard

Frank E Harrell Jr f.harr...@vanderbilt.edu wrote on 27/02/2009 13:55 (to Gerard M. Keogh gmke...@justice.ie, cc R list r-h...@stat.math.ethz.ch, subject Re: [R] Inefficiency of SAS Programming):

Gerard M. Keogh wrote:

Frank, I can't see the code you mention (Web marshall at work), but I don't think you should be too quick to run down SAS. It's a powerful and flexible language but unfortunately very expensive.

Your example mentions doing a vector product in the macro language. This only suggests to me that the people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices; macros are only an efficient code-copying device. Doing matrix computations in this way is pretty mad and the code would be impossible, never mind the memory problems. SAS recognise that, but a lot of SAS users remain unfamiliar with IML.

In IML, by contrast, there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS, for example, is one line:

    b = solve(X`X, X`y) ; rss = sqrt(ssq(y - X*b)) ;

And to give you a flavour of IML's capabilities, I implemented a SAS version of the MARS program in it about 6 or 7 years ago. BTW, SPSS also has a matrix language.

Gerard

But try this:

    PROC IML;
    ... some custom user code ...
    ... loop over j=1 to 10 ...
    ... PROC GENMOD, output results back to IML ...

IML is only a partial solution since it is not integrated with the PROC step.

Frank

Frank E Harrell Jr f.harr...@vanderbilt.edu wrote on 26/02/2009 22:57 (to R list r-h...@stat.math.ethz.ch, sent by r-help-boun...@r-project.org, subject [R] Inefficiency of SAS Programming):

If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.

Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
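For comparison with the IML one-liner quoted above, here is a rough R sketch of the same least-squares computation, followed by the kind of vector product Frank mentions for evaluating logistic-model predictions. The simulated data, the variable names and the coefficient vector `beta` are all assumptions made up for illustration; nothing here is taken from the AHRQ programs.

```r
# Simulated data; all names here are illustrative, not from the thread.
set.seed(1)
n <- 50
X <- cbind(1, rnorm(n), rnorm(n))      # design matrix with an intercept column
y <- drop(X %*% c(2, -1, 0.5)) + rnorm(n)

# R counterpart of the IML expression solve(X`X, X`y)
b <- solve(crossprod(X), crossprod(X, y))

# Square root of the residual sum of squares, as in sqrt(ssq(y - X*b))
rss_root <- sqrt(sum((y - X %*% b)^2))

# Cross-check against R's built-in least-squares fitter
fit <- lm.fit(X, y)
max_diff <- max(abs(drop(b) - fit$coefficients))

# Logistic-model predictions: the linear predictor for every row is one
# matrix-vector product, followed by the inverse logit
beta <- c(-1, 0.8, 0.3)                # hypothetical fitted coefficients
p_hat <- plogis(drop(X %*% beta))
```

Solving the normal equations directly mirrors the IML line but is not how `lm()` works internally (it uses a QR decomposition), so for ill-conditioned designs the built-in fitters are the safer choice.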
Re: [R] Inefficiency of SAS Programming
on 02/27/2009 07:57 AM Frank E Harrell Jr wrote:

Ajay ohri wrote:

I would like to know if we can create a package in which R functions are renamed closer to the SAS language. Doing so would help people familiar with SAS to take to R for their work straight away, thus lowering the threshold for acceptance, and then get into a deeper understanding later. Since it is a package, it would be optional, only for people wanting to try out R from SAS. Do we have such a package right now? It basically masks R functions with the equivalent function name in another language, just for user ease for beginners. For example:

creating a function for means

    procmeans <- function(x, y) {
      summary(subset(x, select = c(x, y)))
    }

creating a function for importing csv

    procimport <- function(x, y) {
      read.csv(textConnection(x), row.names = y, na.strings= )
    }

creating a function for describing data

    procunivariate <- function(x) {
      summary(x)
    }

regards, ajay

Ajay, This will generate major confusion among users of all types and be hard to maintain. A better approach is to get Bob Muenchen's excellent book and keep it nearby.

Frank

I wholeheartedly agree with Frank here. It may be one thing to have a translation process in place based upon some form of logical mapping between the two languages (as Bob's book provides). But it is another thing entirely to actually start writing functions that provide wrappers modeled on SAS-based PROCs.

If you do this, then you only serve to obfuscate the fundamental philosophical and functional differences between the two languages and doom a new useR to missing all of R's benefits. They will continue to try to figure out how to use R based upon their SAS intuition rather than developing a new set of coding and even statistical paradigms.

Having been through the SAS to S/R transition myself, having used SAS for much of the 90's and now having used R for over 7 years, I can speak from personal experience and state that the only way to achieve the requisite proficiency with R is immersion therapy.

Regards, Marc Schwartz
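For readers wondering what the proposed wrappers would amount to once made runnable, here is a minimal sketch. The function names follow Ajay's message, but the argument names (`data`, `vars`, `text`, `row_names`) and the sample CSV are assumptions chosen for illustration, not his actual design.

```r
# Hypothetical SAS-flavoured wrappers; names and arguments are illustrative only.

# Summarise selected columns of a data frame (roughly what PROC MEANS reports)
procmeans <- function(data, vars) {
  summary(data[, vars, drop = FALSE])
}

# Read CSV text held in a character string; row_names is passed to read.csv()
procimport <- function(text, row_names = NULL) {
  read.csv(text = text, row.names = row_names)
}

# Describe a single variable
procunivariate <- function(x) {
  summary(x)
}

# Made-up data to exercise the wrappers
csv <- "id,height,weight\na,1.6,60\nb,1.8,80\nc,1.7,70"
d <- procimport(csv, row_names = "id")
procmeans(d, c("height", "weight"))
height_summary <- procunivariate(d$height)
```

Each wrapper is a one-line call to an existing R function, which may be read as supporting either side of the argument: the mapping is easy to write, but it adds a layer of names without adding any capability.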
Re: [R] Inefficiency of SAS Programming
I've actually used AHRQ's software to create Inpatient Quality Indicator reports. I can confirm pretty much what we already know; it is inefficient. Running on about 1.8 - 2 million cases, it would take just about a whole day to run the entire process from start to finish. That isn't all processing time and includes some time for the analyst to check results between substeps, but I still knew that my day was full when I was working on IQI reports.

To be fair though, there are a lot of other factors (besides efficiency considerations) that go into AHRQ's program design. First, there are a lot of changes to that software every year. In some cases it is easier and less error prone to hardcode a few points in the data so that it is blatantly obvious what to change next year should another analyst need to do so. Second, the organizations that use this software often require transparency and may not have high level programmers on staff. Writing code so that it is accessible, editable, and interpretable by intermediate level programmers or analysts is a plus. Third, given that IQI reports are often produced on a yearly basis, there's no real need to sacrifice clarity, etc. for efficiency - you're only doing this process once a year.

There are other points that could be made, but the main idea is I don't think it's fair to hold this software up, out of context, as an example of SAS's (or even AHRQ's) inefficiencies. I agree that SAS syntax is nowhere near as elegant or as powerful as R from a programming standpoint; that's why after 7 years of using SAS I switched to R. But comparing the two at that level is like racing a Ferrari and a Bentley to see which is the better car.
Re: [R] Inefficiency of SAS Programming
I had enrolled in a statistics course this semester, but after the first class, I dropped it because it uses SAS. This thread makes me quite glad.

Tom!

On Fri, Feb 27, 2009 at 8:48 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:

Wensui Liu wrote:

Thanks for pointing me to the SAS code, Dr Harrell. After reading the code, I have to say that the inefficiency is not related to the SAS language itself but to the SAS programmer. An experienced SAS programmer won't use so much hard-coding, which is very ad hoc and difficult to maintain. I agree with you that, in the SAS code, it takes a little too much to evaluate predictions. Such a complex data step can actually be replaced by simpler IML code.

Agreed that the SAS code could have been much better. I programmed in SAS for 23 years and would have done it much differently. But you will find that the most elegant SAS program re-write will still be a far cry from the elegance of R.

Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] Inefficiency of SAS Programming
Immersion therapy can be done at a later stage, after the newly baptized corporate R user is happy with the fact that he can now do most of his legacy code in R easily. I have been treading water in the immersion for over a year now. Most SAS consultants and corporate users are eager to try out R, but they are scared of immersion, especially in these cut-back times, so this could be a middle step. Let me go ahead and create the wrapper SAS package as middleware between R and SAS, and we will let the invisible hand of the free market decide :))

regards, ajay

www.decisionstats.com

I am not a Marxist. Karl Marx http://www.brainyquote.com/quotes/quotes/k/karlmarx131048.html
Re: [R] Inefficiency of SAS Programming
Ajay ohri wrote:

Immersion therapy can be done at a later stage, after the newly baptized corporate R user is happy with the fact that he can now do most of his legacy code in R easily. I have been treading water in the immersion for over a year now. Most SAS consultants and corporate users are eager to try out R, but they are scared of immersion, especially in these cut-back times, so this could be a middle step. Let me go ahead and create the wrapper SAS package as middleware between R and SAS, and we will let the invisible hand of the free market decide :))

This is futile and will make it more difficult for other R users to help you in the future. As Marc said, this is really a bad idea and will backfire.

Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] Inefficiency of SAS Programming
Three comments

1. I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language.

2. I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions.

a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better.

b. If I were writing the S code for such an application, it would look much the same. I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code.

3. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth.

S was once headed down the same road. One of the best things ever with the language was documented in the blue book The New S Language, where Becker et al had the wisdom to scrap the macro processor.

Terry Therneau
Re: [R] Inefficiency of SAS Programming
Terry Therneau wrote:

Three comments

1. I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language.

2. I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions.

a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better.

b. If I were writing the S code for such an application, it would look much the same. I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code.

If I were writing S code for this it would be dramatically different. I would try to be efficient and elegant but would need to remember to be a teacher at the same time. For example, this kind of recode is super efficient and quick to program but would need good comments or a handbook to all of my code:

    c(cat=1, dog=2, giraffe=3)[animal]

But I think the code is quite intuitive once you have used that construct once. There is also a lot of factoring of code that could be done, as others have pointed out.

3. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth.

S was once headed down the same road. One of the best things ever with the language was documented in the blue book The New S Language, where Becker et al had the wisdom to scrap the macro processor.

Well put. I am amazed there hasn't been a revolt among SAS users decades ago. The S approach is also easier to debug one line at a time.

Cheers, Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
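The named-vector recode Frank shows in this message can be tried on a small example. The sketch below assumes `animal` is a character vector; the sample values are made up, and the `match()` variant is offered only as one more explicit alternative, not something proposed in the thread.

```r
animal <- c("dog", "cat", "giraffe", "dog")   # made-up input

# Named vector as a lookup table: indexing by name returns the matching codes
codes <- c(cat = 1, dog = 2, giraffe = 3)[animal]

# unname() drops the labels carried along from the lookup table
plain_codes <- unname(codes)

# A more explicit equivalent: position of each value in a list of known levels
codes2 <- match(animal, c("cat", "dog", "giraffe"))
```

One caveat worth a comment in real code: if `animal` is a factor rather than a character vector, the lookup indexes by the factor's integer codes instead of its labels, so converting with `as.character()` first is the safer habit.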
Re: [R] Inefficiency of SAS Programming
Terry's remarks are well received; however, I take issue with one part of his comments. As a long-time programmer (in both statistical programming languages and traditional programming languages), I miss the ability to write native-languages in R. While macros can make for difficult-to-read code, when used properly they can also make flexible code that, if properly written (including good documentation, which should be a part of any code), can be easy to read.

Finally, everyone must remember that SAS code can be difficult to understand or inefficient, just as R code can be difficult to understand or inefficient. In the end, both programming systems have their advantages and disadvantages. No programming language is perfect. It is neither fair nor correct to damn one or the other. Accept the fact that some things are more easily and more clearly done in one language, while other things are more clearly and more easily done in another language. Let's move on to more important issues, viz. improving R so it is as good as it possibly can be.

John

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)
Re: [R] Inefficiency of SAS Programming
Dear Anonymous,

Nice points. I would just add that it would be better if government-sponsored projects resulted in software that could be run without expensive licenses.

Thanks

Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
Re: [R] Inefficiency of SAS Programming
Nice points John. My only response is that I learned SAS in 1969 and used it intensively until 1991. I wrote some of the first user-contributed SAS procedures (PROCs PCTL, GRAPH, DATACHK, LOGIST, PHGLM) and wrote extensively in the macro language. After using S-Plus for only one month my productivity was far ahead of my productivity using SAS.

Frank
Re: [R] Inefficiency of SAS Programming
A further example of software pricing dynamics is the complete lack of awareness of WPS, a UK-based software product which is basically a Base SAS clone with all the features of SAS (code read/write and data read/write), priced at only 660$ per desktop and 1400$ for server licenses. That is very, very cheap compared to Base SAS, and it has a Bridge to R for higher-level statistics. You would think a corporate user would not have any hesitation in switching to a clone priced at 10 %, yet there are hardly any takers for it in the federal government :))

People worried about their government's spending should use the new website http://www.recovery.gov/?q=content/contact , which is supposed to chronicle this; it would be a good test and control for the Web 2.0 initiatives.
Re: [R] Inefficiency of SAS Programming
Frank, A programming language's efficiency is a function of several items, including what you are trying to program. Without using SAS PROC IML, I have found that it is more efficient to code algorithms (e.g. a least squares linear regression) using R than SAS; we all know that matrix notation leads to more compact syntax than can be had when using non-matrix notation, and R implements matrix notation. On the other hand, searching, sub-setting, merging etc. can at times be coded more efficiently, more easily, and in a more easily understood fashion in SAS. I am sure you know people who use SAS to set up their datasets and then use R when they are developing an algorithm. Just as French may be a better language to express love, Italian a better language in which to write opera, and English the most efficient language for communication (at least for the last 50 years), so too do both R and SAS have a place in the larger world. John

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

Frank E Harrell Jr f.harr...@vanderbilt.edu 2/27/2009 12:52 PM John Sorkin wrote: Terry's remarks (see below) are well received; however, I take issue with one part of his comments. As a long time programmer (in both statistical programming languages and traditional programming languages), I miss the ability to write native-languages in R. While macros can make for difficult to read code, when used properly, they can also make flexible code that, if properly written (including good documentation, which should be a part of any code) can be easy to read. Finally, everyone must remember that SAS code can be difficult to understand or inefficient just as R code can be difficult to understand or inefficient.
In the end, both programming systems have their advantages and disadvantages. No programming language is perfect. It is neither fair nor correct to damn one or the other. Accept the fact that some things are more easily and more clearly done in one language, other things are more clearly and more easily done in another language. Let's move on to more important issues, viz. improving R so it is as good as it possibly can be. John

Nice points John. My only response is that I learned SAS in 1969 and used it intensively until 1991. I wrote some of the first user-contributed SAS procedures (PROCs PCTL, GRAPH, DATACHK, LOGIST, PHGLM) and wrote extensively in the macro language. After using S-Plus for only one month my productivity was far ahead of my productivity using SAS. Frank

John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

Terry Therneau thern...@mayo.edu 2/27/2009 10:23 AM Three comments

I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damn a language.

I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions.
a. Everyone, even those of you with no SAS background whatsoever, immediately understood the code. Most of the replacements are obscure. Compilers are very good these days and computers are fast; fewer typed characters != better.
b. If I were writing the S code for such an application, it would look much the same.
I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth. S was once headed down the same road. One of the best things ever with the language was documented in the blue book The New S Language, where Becker et al had the wisdom to scrap the macro processor. Terry Therneau
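To make the matrix-notation point above concrete, here is a minimal R sketch of a least squares fit written directly from the normal equations. The data are simulated and every variable name is illustrative; nothing here is taken from the AHRQ programs or from any poster's code. It only uses base R (`solve`, `crossprod`, `lm`).

```r
# Simulated data; names and values are illustrative assumptions only.
set.seed(1)
n <- 50
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)

X <- cbind(1, x)                             # design matrix with intercept column
b <- solve(crossprod(X), crossprod(X, y))    # solves (X'X) b = X'y
rss <- sum((y - X %*% b)^2)                  # residual sum of squares

# Cross-check against R's built-in fitting routine
fit <- lm(y ~ x)
print(cbind(normal_equations = as.vector(b), lm = unname(coef(fit))))
```

In practice `lm()` is the tool to use (it works via a QR decomposition rather than the normal equations); the hand-written version is only there to show how compact the matrix form is.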
Re: [R] Inefficiency of SAS Programming
My apologies, this obviously doubles as my for registration purposes account and so I don't often send from it - I was not intentionally being so secretive : ) At any rate, I completely agree, but of course it's a reciprocal relationship. The software is written in SAS because that's what the organizations use, the organizations use SAS because that's what the programs are written in... For better or worse, SAS's integration in big bureaucracies is the main thing that keeps it competitive in the marketplace and viable. There aren't a lot of other contexts in which their pricing structure would work. Bryan On Fri, Feb 27, 2009 at 12:48 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: spam me wrote: I've actually used AHRQ's software to create Inpatient Quality Indicator reports. I can confirm pretty much what we already know; it is inefficient. Running on about 1.8 - 2 million cases, it would take just about a whole day to run the entire process from start to finish. That isn't all processing time and includes some time for the analyst to check results between substeps, but I still knew that my day was full when I was working on IQI reports. To be fair though, there are a lot of other factors (beside efficiency considerations) that go into AHRQ's program design. First, there are a lot of changes to that software every year. In some cases it is easier and less error prone to hardcode a few points in the data so that it is blatantly obvious what to change next year should another analyst need to do so. Second, the organizations that use this software often require transparency and may not have high level programmers on staff. Writing code so that it is accessible, editable, and interpretable by intermediate level programmers or analysts is a plus. Third, given that IQI reports are often produced on a yearly basis, there's no real need to sacrifice clarity, etc. for efficiency - you're only doing this process once a year. 
There are other points that could be made, but the main idea is I don't think it's fair to hold this software up, out of context, as an example of SAS's (or even AHRQs) inefficiencies. I agree that SAS syntax is nowhere near as elegant or as powerful as R from a programming standpoint, that's why after 7 years of using SAS I switched to R. But comparing the two at that level is like a racing a Ferrari and a Bentley to see which is the better car. Dear Anonymous, Nice points. I would just add that it would be better if government-sponsored projects would result in software that could be run without expensive licenses. Thanks Frank [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inefficiency of SAS Programming
Also because no one wants to put their neck out on a chopping block to suggest R without technical support and the like. If you use SAS, there's a cascade of blame available, but it's not immediately available for R. On Fri, Feb 27, 2009 at 10:36 AM, Bryan thespamho...@gmail.com wrote: My apologies, this obviously doubles as my for registration purposes account and so I don't often send from it - I was not intentionally being so secretive : ) At any rate, I completely agree, but of course it's a reciprocal relationship. The software is written in SAS because that's what the organizations use, the organizations use SAS because that's what the programs are written in... For better or worse, SAS's integration in big bureaucracies is the main thing that keeps it competitive in the marketplace and viable. There aren't a lot of other contexts in which their pricing structure would work. Bryan On Fri, Feb 27, 2009 at 12:48 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: spam me wrote: I've actually used AHRQ's software to create Inpatient Quality Indicator reports. I can confirm pretty much what we already know; it is inefficient. Running on about 1.8 - 2 million cases, it would take just about a whole day to run the entire process from start to finish. That isn't all processing time and includes some time for the analyst to check results between substeps, but I still knew that my day was full when I was working on IQI reports. To be fair though, there are a lot of other factors (beside efficiency considerations) that go into AHRQ's program design. First, there are a lot of changes to that software every year. In some cases it is easier and less error prone to hardcode a few points in the data so that it is blatantly obvious what to change next year should another analyst need to do so. Second, the organizations that use this software often require transparency and may not have high level programmers on staff. 
Writing code so that it is accessible, editable, and interpretable by intermediate level programmers or analysts is a plus. Third, given that IQI reports are often produced on a yearly basis, there's no real need to sacrifice clarity, etc. for efficiency - you're only doing this process once a year. There are other points that could be made, but the main idea is I don't think it's fair to hold this software up, out of context, as an example of SAS's (or even AHRQs) inefficiencies. I agree that SAS syntax is nowhere near as elegant or as powerful as R from a programming standpoint, that's why after 7 years of using SAS I switched to R. But comparing the two at that level is like a racing a Ferrari and a Bentley to see which is the better car. Dear Anonymous, Nice points. I would just add that it would be better if government-sponsored projects would result in software that could be run without expensive licenses. Thanks Frank [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inefficiency of SAS Programming
But SAS/IML is not part of base SAS, it costs extra, so there is a good chance that a user that has SAS will not be able to run code that uses SAS/IML. I have known of SAS programmers who know IML well that still write matrix/vector tools using macros or proc transpose so that a user without IML can still use the code (the fact that the code that started this thread was found on a website suggests that it was meant for general use rather than something only used internally where you know what add-ons will be available). Just another way that R makes life easier for both programmer and user.

-- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111

-----Original Message----- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Gerard M. Keogh Sent: Friday, February 27, 2009 7:19 AM To: Frank E Harrell Jr Cc: r-help-boun...@r-project.org; R list Subject: Re: [R] Inefficiency of SAS Programming

Yes Frank, I accept your point but nevertheless IML is the proper place for matrix work in SAS - mixing macro-level logic and computation is another question - R is certainly more seamless in this respect. Gerard

From: Frank E Harrell Jr f.harr...@vanderbilt.edu To: Gerard M. Keogh gmke...@justice.ie Date: 27/02/2009 13:55 cc: R list r-h...@stat.math.ethz.ch, r-help-boun...@r-project.org Subject: Re: [R] Inefficiency of SAS Programming

Gerard M. Keogh wrote: Frank, I can't see the code you mention - Web marshall at work - but I don't think you should be too quick to run down SAS - it's a powerful and flexible language but unfortunately very expensive. Your example mentions doing a vector product in the macro language - this only suggests to me that those people writing the code need a crash course in SAS/IML (the matrix language). SAS is designed to work on records and so is inappropriate for matrices - macros are only an efficient code copying device.
Doing matrix computations in this way is pretty mad and the code would be impossible, never mind the memory problems. SAS recognise that, but a lot of SAS users remain unfamiliar with IML. In IML by contrast there are inner, cross and outer products and a raft of other useful methods for matrix work that R users would be familiar with. OLS for example is one line:

b = solve(X`X, X`y) ; rss = sqrt(ssq(y - X*b)) ;

And to give you a flavour of IML's capabilities I implemented a SAS version of the MARS program in it about 6 or 7 years ago. BTW SPSS also has a matrix language. Gerard

But try this:

PROC IML;
... some custom user code ...
... loop over j=1 to 10 ...
... PROC GENMOD, output results back to IML ...

IML is only a partial solution since it is not integrated with the PROC step. Frank

From: Frank E Harrell Jr f.harr...@vanderbilt.edu Sent by: r-help-boun...@r-project.org To: R list r-h...@stat.math.ethz.ch Date: 26/02/2009 22:57 Subject: [R] Inefficiency of SAS Programming

If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank

-- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University
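For comparison with the macro-language vector product discussed in this message, a rough R sketch of evaluating logistic-regression predictions as a single matrix product might look like the following. The coefficients and covariate values are made up for illustration; they are not the AHRQ model's, and the names `beta`, `X`, `eta` and `p` are hypothetical.

```r
# Hypothetical coefficients (intercept, indicator, age) and three example cases.
# These numbers are invented for illustration only.
beta <- c(-1.5, 0.8, 0.02)
X <- cbind(1,              # intercept column
           c(0, 1, 1),     # a 0/1 risk indicator
           c(40, 55, 70))  # age in years

eta <- drop(X %*% beta)    # linear predictor: one vector product per row
p <- plogis(eta)           # inverse logit gives predicted probabilities
print(round(p, 3))
```

With a fitted `glm` object one would normally call `predict(fit, newdata, type = "response")` instead; the explicit product is shown only because it is the operation the thread is arguing about.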
Re: [R] Inefficiency of SAS Programming
John Sorkin wrote: Frank, A programming language's efficience is a function of several items, including what you are trying to program. Without using SAS proc IML, I have found that it is more efficient to code algorithms (e.g. a least squares linear regression) using R than SAS; we all know that matrix notation leads to more compact syntax than can be had when using non-matrix notation and R implements matrix notation. On the other hand, searching, sub-setting, merging etc. can a times be coded more efficiently, more easily, and in a more easily understood fashion is SAS. I am sure you people who use SAS to set up their datasets and then use R when they are developing an algorithm. Just as French may be a better language to express love, Italian a better language in which to write opera, and English the most efficient language for communication (at least for the last 50 years), so too do both R and SAS have a place in the larger world. John John I'll have to strongly disagree with most of your statement about data manipulation. R is far more powerful, easier to debug dynamically, and concise for merging, reshaping, recoding, etc. But I agree on the easily understood portion of your statement. Cheers Frank John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Frank E Harrell Jr f.harr...@vanderbilt.edu 2/27/2009 12:52 PM John Sorkin wrote: Terry's remarks (see below) are well received however, I take issue with one part of his comments. As a long time programmer (in both statistical programming languages and traditional programming languages), I miss the ability to write native-languages in R. 
While macros can make for difficult to read code, when used properly, they can also make flexible code that, if properly written (including good documentation, which should be a part of any code) can be easy to read. Finally, everyone must remember that SAS code can be difficult to understand or inefficient just as R code can be difficult to understand or inefficient. In the end, both programming systems have their advantages and disadvantage. No programming language is perfect. It is not fair, nor correct to damn one or the other. Accept the fact that some things are more easily and more clearly done in one language, other things are more clearly and more easily done in another language. Let's move on to more important issues, viz. improving R so it is as good as it possibly can be. John Nice points John. My only response is that I learned SAS in 1969 and used it intensively until 1991. I wrote some of the first user-contributed SAS procedures (PROCs PCTL, GRAPH, DATACHK, LOGIST, PHGLM) and wrote extensively in the macro language. After using S-Plus for only one month my productivity was far ahead of my productivity using SAS. Frank John David Sorkin M.D., Ph.D. Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) Terry Therneau thern...@mayo.edu 2/27/2009 10:23 AM Three comments I actually think you can write worse code in R than in SAS: more tools = more scope for innovatively bad ideas. The ability to write bad code should not damm a language. I found almost all of the improvements to the multi-line SAS recode to be regressions, both the SAS and the S suggestions. a. Everyone, even those of you with no SAS backround whatsoever, immediately understood the code. Most of the replacements are obscure. 
Compilers are very good these days and computers are fast, fewer typed characters != better. b. If I were writing the S code for such an application, it would look much the same. I worked as a programmer in medical research for several years, and one of the things that moved me on to graduate studies in statistics was the realization that doing my best work meant being as UN-clever as possible in my code. Frank's comments imply that he was reading SAS macro code at the moment of peak frustration. And if you want to criticise SAS code, this is the place to look. SAS macro started out as some simple expansions, then got added on to, then added on again, and again, and with no overall blueprint. It is much like the farmhouse of some neighbors of mine growing up: 4 different expansions in 4 eras, and no overall guiding plan. The interior layout was interesting to say the least. I was once a bona fide SAS 'wizard' (and Frank was much better than me), and I can't read the stuff without grinding my teeth. S was once
Re: [R] Inefficiency of SAS Programming
On Fri, Feb 27, 2009 at 8:53 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: Ajay ohri wrote: Sometimes for the sake of simplicity, SAS coding is created like that. One can use the concatenate function and drag and drop in a simple Excel sheet for creating elaborate SAS code like the one mentioned, in no time at all.

A system that requires Excel for its success is not a complete system.

To be fair, R depends on perl (although this dependence seems to be decreasing lately and possibly will be eliminated), latex and a bunch of unix tools. Developing GUIs depends on tcl/tk or another external system, and developing fast code can require that some of it be written in C.
Re: [R] Inefficiency of SAS Programming
Frank, I couldn't locate the program you mentioned. Do you mind being more specific? Could you please point me to the file? I am just curious. Thanks.

On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University

-- === WenSui Liu Acquisition Risk, Chase Blog : statcompute.spaces.live.com "I can calculate the motion of heavenly bodies, but not the madness of people." -- Isaac Newton ===
Re: [R] Inefficiency of SAS Programming
2009/2/26 Frank E Harrell Jr f.harr...@vanderbilt.edu: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.

Plenty of examples ripe for sending to www.thedailywtf.com there. Like this:

IF N. = 1 THEN SUB_N = 1;
IF N. = 3 THEN SUB_N = 2;
IF N. = 4 THEN SUB_N = 3;
IF N. = 6 THEN SUB_N = 4;
IF N. = 7 THEN SUB_N = 5;
IF N. = 8 THEN SUB_N = 6;
IF N. = 9 THEN SUB_N = 7;
IF N. = 10 THEN SUB_N = 8;
IF N. = 11 THEN SUB_N = 9;
IF N. = 12 THEN SUB_N = 10;
IF N. = 13 THEN SUB_N = 11;
IF N. = 14 THEN SUB_N = 12;
IF N. = 15 THEN SUB_N = 13;
IF N. = 17 THEN SUB_N = 14;
IF N. = 18 THEN SUB_N = 15;
IF N. = 19 THEN SUB_N = 16;

Of course it's possible to write code like that in any language, it just looks worse when it's in ALL CAPS and written in a style that looks like the 1980s and onward never happened. The question is whether it's possible to write this better in SAS. Most of us on this list could write it in R in a better way. Barry
Re: [R] Inefficiency of SAS Programming
Barry Rowlingson wrote: 2009/2/26 Frank E Harrell Jr f.harr...@vanderbilt.edu: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.

Plenty of examples ripe for sending to www.thedailywtf.com there. Like this:

IF N. = 1 THEN SUB_N = 1;
IF N. = 3 THEN SUB_N = 2;
IF N. = 4 THEN SUB_N = 3;
IF N. = 6 THEN SUB_N = 4;
IF N. = 7 THEN SUB_N = 5;
IF N. = 8 THEN SUB_N = 6;
IF N. = 9 THEN SUB_N = 7;
IF N. = 10 THEN SUB_N = 8;
IF N. = 11 THEN SUB_N = 9;
IF N. = 12 THEN SUB_N = 10;
IF N. = 13 THEN SUB_N = 11;
IF N. = 14 THEN SUB_N = 12;
IF N. = 15 THEN SUB_N = 13;
IF N. = 17 THEN SUB_N = 14;
IF N. = 18 THEN SUB_N = 15;
IF N. = 19 THEN SUB_N = 16;

Of course it's possible to write code like that in any language, it just looks worse when it's in ALL CAPS and written in a style that looks like the 1980s and onward never happened. The question is whether it's possible to write this better in SAS. Most of us on this list could write it in R in a better way.

Presumably, something like

IF N. = 1 THEN SUB_N = 1;
ELSE IF N. < 5 THEN SUB_N = N.-1;
ELSE IF N. < 16 THEN SUB_N = N.-2;
ELSE SUB_N = N.-3;

would work, provided that 2, 5, 16 are impossible values. Problem is that it actually makes the code harder to grasp, so experienced SAS programmers go for the dumb but readable code like the above.
In R, the cleanest I can think of is

subn <- match(n, setdiff(1:19, c(2,5,16)))

or maybe just

subn <- match(n, c(1, 3:4, 6:15, 17:19))

although

subn <- factor(n, levels = c(1, 3:4, 6:15, 17:19))

might be what is really wanted.

-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907 (p.dalga...@biostat.ku.dk)
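As a quick check of the `match()` recode suggested above, the following small R sketch runs it on a handful of test values and confirms that the two spellings of the lookup table agree. The test vector `n` is made up for illustration, and the value 2 is included only to show what happens to an "impossible" code.

```r
# Illustrative input: valid codes plus one value (2) that is not in the table
n <- c(1, 3, 4, 6, 15, 17, 19, 2)

valid <- setdiff(1:19, c(2, 5, 16))   # the 16 codes that have a SUB_N
subn <- match(n, valid)               # position in the table is the new code

print(subn)
# The explicit table gives the same result
subn2 <- match(n, c(1, 3:4, 6:15, 17:19))
```

Unlike the chain of IF statements, an unlisted value comes back as `NA` rather than leaving the result silently unset, which may or may not be the behaviour one wants.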
Re: [R] Inefficiency of SAS Programming
On 26 Feb 2009 at 23:47, Barry Rowlingson wrote: 2009/2/26 Frank E Harrell Jr f.harr...@vanderbilt.edu: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.

Plenty of examples ripe for sending to www.thedailywtf.com there. Like this:

IF N. = 1 THEN SUB_N = 1;
IF N. = 3 THEN SUB_N = 2;
IF N. = 4 THEN SUB_N = 3;
IF N. = 6 THEN SUB_N = 4;
IF N. = 7 THEN SUB_N = 5;
IF N. = 8 THEN SUB_N = 6;
IF N. = 9 THEN SUB_N = 7;
IF N. = 10 THEN SUB_N = 8;
IF N. = 11 THEN SUB_N = 9;
IF N. = 12 THEN SUB_N = 10;
IF N. = 13 THEN SUB_N = 11;
IF N. = 14 THEN SUB_N = 12;
IF N. = 15 THEN SUB_N = 13;
IF N. = 17 THEN SUB_N = 14;
IF N. = 18 THEN SUB_N = 15;
IF N. = 19 THEN SUB_N = 16;

Of course it's possible to write code like that in any language, it just looks worse when it's in ALL CAPS and written in a style that looks like the 1980s and onward never happened. The question is whether it's possible to write this better in SAS. Most of us on this list could write it in R in a better way.

Oh, it's definitely possible to write better SAS code than that. This should do the trick:

Sub_n = input(scan('1 . 2 3 . 4 5 6 7 8 9 10 11 12 13 . 14 15 16', N, ' '), 2.);

among various other ways. But it remains true that certain operations in SAS will be quite inefficient. ---JRG

John R. Gleason Associate Professor Syracuse University 430 Huntington Hall Voice: 315-443-3107 Syracuse, NY 13244-2340 USA FAX: 315-443-4085 PGP public key at keyservers
Re: [R] Inefficiency of SAS Programming
Sometimes for the sake of simplicity, SAS coding is created like that. One can use the concatenate function and drag and drop in an simple excel sheet for creating elaborate SAS code like the one mentioned and without any time at all. There are multiple ways to do this in SAS , much better and similarly in R There are many areas that SAS programmers would find R a bit not so useful ---example the equivalence of proc logistic for creating a logistic model. On Fri, Feb 27, 2009 at 10:21 AM, Wensui Liu liuwen...@gmail.com wrote: Thanks for pointing me to the SAS code, Dr Harrell After reading codes, I have to say that the inefficiency is not related to SAS language itself but the SAS programmer. An experienced SAS programmer won't use much of hard-coding, very adhoc and difficult to maintain. I agree with you that in the SAS code, it is a little too much to evaluate predictions. such complex data step actually can be replaced by simpler iml code. On Thu, Feb 26, 2009 at 5:57 PM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote: If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm. The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4. Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
--
WenSui Liu
Acquisition Risk, Chase
Blog: statcompute.spaces.live.com
I can calculate the motion of heavenly bodies, but not the madness of people. -- Isaac Newton
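For readers following the thread, the vector product that Frank Harrell mentions can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the coefficients and covariate values below are made up for the example and are not taken from the AHRQ programs, and the variable names are hypothetical.

```r
# Hypothetical coefficients of a fitted logistic model (intercept first).
# These numbers are invented for illustration, not AHRQ values.
beta <- c(-2.0, 0.5, 1.2)

# Hypothetical design matrix: a column of 1s for the intercept,
# followed by two made-up covariates for three patients.
X <- cbind(1, age_decades = c(4, 6, 8), comorbid = c(0, 1, 1))

# Linear predictor for all rows at once: a single matrix-vector product.
lp <- drop(X %*% beta)

# The inverse logit maps the linear predictor to predicted probabilities.
p <- plogis(lp)

print(round(p, 3))
```

When the model has been fitted in R with glm(), predict(fit, newdata, type = "response") performs the same calculation without any explicit matrix algebra.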
Re: [R] Inefficiency of SAS Programming
How could this agency be convinced to adopt R code, and how would that work?

Regards,
Ajay
www.decisionstats.com

On Fri, Feb 27, 2009 at 4:27 AM, Frank E Harrell Jr f.harr...@vanderbilt.edu wrote:

If anyone wants to see a prime example of how inefficient it is to program in SAS, take a look at the SAS programs provided by the US Agency for Healthcare Research and Quality for risk adjusting and reporting for hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm . The PSSASP3.SAS program is a prime example. Look at how you do a vector product in the SAS macro language to evaluate predictions from a logistic regression model. I estimate that using R would easily cut the programming time of this set of programs by a factor of 4.

Frank
--
Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University