Re: [R] anti-R vitriol

2004-06-30 Thread TEMPL Matthias
Hi,

I wonder why SAS should be faster at reading data into the system.
I have an example that shows that R is (sometimes? always?) faster.

-
Data with 14432 observations and 120 variables.
Time for reading the data:

SAS 8e:
data testt;
set l1.lse01;run;

real time   1.46 seconds
cpu time    0.18 seconds

R 1.9.0:
system.time(read.table("lse01.txt", header=TRUE))
[1] 0.63 0.06 6.22   NA   NA


And this is 2.5 times faster than SAS.
(SAS reads the .sas7bdat and R the .txt file)
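
When comparing the figures above, it may help to recall that system.time() reports user CPU time, system CPU time, and elapsed (wall-clock) time, in that order; the elapsed value is the one that corresponds to SAS's "real time". A minimal sketch, assuming a small made-up file in place of lse01.txt (the file and its contents are hypothetical):

```r
# Hypothetical stand-in for lse01.txt: a small whitespace-delimited table.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(x = rnorm(1000), y = rnorm(1000)),
            file = tmp, row.names = FALSE, quote = FALSE)

timing <- system.time(dat <- read.table(tmp, header = TRUE))

# In current R versions the first three entries are named
# "user.self", "sys.self" and "elapsed".
print(timing)
elapsed_seconds <- timing[["elapsed"]]  # wall-clock time, comparable to SAS "real time"

unlink(tmp)
```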

I'm working with SAS (I am supposed to work with SAS) and R (I am going to work with R) on 
the same computer. In my examples with time series, and in simple but 
time-consuming procedures like summaries, R is always at least 2 times faster and sometimes 
30 times faster (with the same results).
I think R is great software, and you can do more things with it than with SAS.
Some new developments in SAS 9, like the COM server to Excel, some new procedures, better 
graphs, ... were developed and implemented in R many years ago.
Thanks to the R Development Team!!!

Matthias

 -----Original Message-----
 From: Liaw, Andy [mailto:[EMAIL PROTECTED] 
 Sent: Tuesday, 29 June 2004 20:21
 To: 'Barry Rowlingson'; R-help
 Subject: RE: [R] anti-R vitriol
 
 
  From: Barry Rowlingson
  
  A colleague is receiving some data from another person. That person
  reads the data in SAS and it takes 30s and uses 64k RAM. That person 
  then tries to read the data in R and it takes 10 minutes and uses a 
  gigabyte of RAM. Person then goes on to say:
  
 It's not that I think SAS is such great software,
 it's not.  But I really hate badly designed
 software.  R is designed by committee.  Worse,
 it's designed by a committee of statisticians.
 They tend to confuse numerical analysis with
 computer science and don't have any idea about
 software development at all.  The result is R.
  
 I do hope [your colleague] won't have to waste time doing
 [this analysis] in an outdated and poorly designed piece
 of software like R.
  
  Would any of the committee like to respond to this? Or shall we just 
  slap our collective forehead and wonder how someone could get 
  such a view?
  
  Barry
  
 
 My $0.02:
 
 R, being a flexible programming language, has an amazing 
 ability to cope with people's laziness/ignorance/inelegance, 
 but it comes at a (sometimes hefty) price.  While there are no 
 specifics on the situation leading to the person's comments, 
 here's one (not as extreme) example that I happened to come 
 across today:
 
  > system.time(spam <- read.table("data_dmc2003_train.txt",
  +                                header=TRUE,
  +                                colClasses=c(rep("numeric", 833),
  +                                             "character")))
  [1] 15.92  0.09 16.80    NA    NA
  > system.time(spam <- read.table("data_dmc2003_train.txt", header=TRUE))
  [1] 187.29   0.60 200.19     NA     NA
 
 My SAS ability is rather severely limited, but AFAIK, one 
 needs to specify _all_ variables to be read into a dataset in 
 order to read in the data in SAS.  If one has that 
 information, R can be very efficient as well.  Without that 
 information, one gets nothing in SAS, or one just lets R do the 
 hard work.
 
 Best,
 Andy
 


__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] anti-R vitriol

2004-06-30 Thread John Maindonald
I am curious.  What were the dimensions of this data set?  Did this 
person use read.table() or scan()?  Did they know about the 
possibility of reading the data one part at a time?
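
To illustrate the scan() route and partial reads mentioned above, here is a minimal sketch. The file and its column names are made up for illustration; the point is only that an explicit `what` template avoids type guessing, and that `skip` and `nmax` let one read a slice of the records.

```r
# Hypothetical three-column file with a header line.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id value label",
             "1 0.5 a",
             "2 1.5 b",
             "3 2.5 c"), tmp)

# `what` is a template: one list element per field, giving its type.
# skip = 1 steps over the header; nmax = 2 reads only the first two records.
part <- scan(tmp,
             what = list(id = 0L, value = 0, label = ""),
             skip = 1, nmax = 2, quiet = TRUE)
str(part)

unlink(tmp)
```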

The way that SAS processes the data row by row limits what can be done. 
 It is often possible with scant loss of information, and more 
satisfactory, to work with a subset of the large data set or with 
multiple subsets.  Neither SAS (in my somewhat dated experience of it) 
nor R is entirely satisfactory for this purpose.  But at least in R, 
given a subset that fits so easily into memory that the graphs are not 
masses of black, there are few logistic problems in doing, rapidly and 
interactively, a variety of manipulations and plots, with each new task 
taking advantage of the learning that has gone before.  To do that well 
in the SAS world, it is necessary to use something like JMP or its 
equivalent in one of the newer modules, which process data in a way 
that is not all that different from R.

I have wondered about possibilities for a suite of functions that would 
make it easy to process through R data that is stored in one large data 
set, with a mix of adding a new variable or variables, repeating a 
calculation on successive subsets of the data, producing predictions or 
suchlike for separate subsets, etc. Database connections may be the way 
to go (cf. the Ripley and Fei Chen paper at ISI 2003), but it might 
also be useful to have a simple set of functions that would handle some 
standard requirements.
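
One way such a function might look is sketched below. This is only an illustration under stated assumptions: `process_in_chunks` is a hypothetical helper, not part of base R, and it assumes a whitespace-delimited text file with a single header line. It keeps one connection open and lets read.table() pull successive blocks of rows from it, so only one block is in memory at a time.

```r
# Hypothetical helper: apply `fun` to successive blocks of `chunk_rows`
# rows and collect the results in a list.
process_in_chunks <- function(path, chunk_rows, fun) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # Read the header line once and reuse the names for every block.
  header <- strsplit(readLines(con, n = 1), "[ \t]+")[[1]]
  results <- list()
  repeat {
    # At end of input read.table() either signals an error or returns
    # zero rows, depending on its arguments; treat both as "done".
    chunk <- tryCatch(
      read.table(con, nrows = chunk_rows, col.names = header),
      error = function(e) NULL)
    if (is.null(chunk) || nrow(chunk) == 0) break
    results[[length(results) + 1]] <- fun(chunk)
    if (nrow(chunk) < chunk_rows) break  # short block: nothing left to read
  }
  results
}

# Usage on a made-up 10-row file, processed 4 rows at a time.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(x = 1:10, g = rep(c("a", "b"), 5)),
            tmp, row.names = FALSE, quote = FALSE)
chunk_sums <- process_in_chunks(tmp, chunk_rows = 4,
                                fun = function(d) sum(d$x))
total <- sum(unlist(chunk_sums))
unlink(tmp)
```

The same loop could add a variable to each block and append it to an output file instead of returning a summary.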

John Maindonald.
On 30 Jun 2004, at 8:02 PM, Barry Rowlingson 
[EMAIL PROTECTED] wrote:

A colleague is receiving some data from another person. That person 
reads the data in SAS and it takes 30s and uses 64k RAM. That person 
then tries to read the data in R and it takes 10 minutes and uses a 
gigabyte of RAM. Person then goes on to say:

  It's not that I think SAS is such great software,
  it's not.  But I really hate badly designed
  software.  R is designed by committee.  Worse,
  it's designed by a committee of statisticians.
  They tend to confuse numerical analysis with
  computer science and don't have any idea about
  software development at all.  The result is R.
  I do hope [your colleague] won't have to waste time doing
  [this analysis] in an outdated and poorly designed piece
  of software like R.
Would any of the committee like to respond to this? Or shall we just 
slap our collective forehead and wonder how someone could get such a 
view?

John Maindonald email: [EMAIL PROTECTED]
phone : +61 2 (6125)3473    fax  : +61 2 (6125)5549
Centre for Bioinformation Science, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.


[R] anti-R vitriol

2004-06-29 Thread Barry Rowlingson
A colleague is receiving some data from another person. That person 
reads the data in SAS and it takes 30s and uses 64k RAM. That person 
then tries to read the data in R and it takes 10 minutes and uses a 
gigabyte of RAM. Person then goes on to say:

  It's not that I think SAS is such great software,
  it's not.  But I really hate badly designed
  software.  R is designed by committee.  Worse,
  it's designed by a committee of statisticians.
  They tend to confuse numerical analysis with
  computer science and don't have any idea about
  software development at all.  The result is R.
  I do hope [your colleague] won't have to waste time doing
  [this analysis] in an outdated and poorly designed piece
  of software like R.
Would any of the committee like to respond to this? Or shall we just 
slap our collective forehead and wonder how someone could get such a view?


Barry


Re: [R] anti-R vitriol

2004-06-29 Thread Berton Gunter
My reaction, as a mere individual user: Of course, one cannot have any idea
what's really going on, so a rational reply to the rant is impossible. But, as
this list repeatedly demonstrates (and as we all have probably experienced), it
is possible to do things foolishly in any software.

Worth noting: John Chambers, the designer of the S language (of which R is an
implementation) won an ACM computing award (readers -- please correct details of
this citation) for his achievement; so apparently the professional computing
community disagreed with the sentiments expressed in the rant.

Cheers,

--

Bert Gunter

Non-Clinical Biostatistics
Genentech
MS: 240B
Phone: 650-467-7374


The business of the statistician is to catalyze the scientific learning
process.

 -- George E.P. Box

Barry Rowlingson wrote:

 A colleague is receiving some data from another person. That person
 reads the data in SAS and it takes 30s and uses 64k RAM. That person
 then tries to read the data in R and it takes 10 minutes and uses a
 gigabyte of RAM. Person then goes on to say:

It's not that I think SAS is such great software,
it's not.  But I really hate badly designed
software.  R is designed by committee.  Worse,
it's designed by a committee of statisticians.
They tend to confuse numerical analysis with
computer science and don't have any idea about
software development at all.  The result is R.

I do hope [your colleague] won't have to waste time doing
[this analysis] in an outdated and poorly designed piece
of software like R.

 Would any of the committee like to respond to this? Or shall we just
 slap our collective forehead and wonder how someone could get such a view?

 Barry



Re: [R] anti-R vitriol

2004-06-29 Thread Roger D. Peng
I'm not too concerned about your colleague's view about R.  S/He 
doesn't have to like it, and I don't think anyone actually believes 
that R is designed to make *everyone* happy.  For me, R does about 99% 
of the things I need to do, but sadly, when I need to order a pizza, I 
still have to pick up the telephone.

What worries me more is that your colleague seems to have lost sight 
of the fact that just about all software development involves 
tradeoffs.  Although I've never used SAS, I've used other stat 
packages and it's clear that all of them (including R) have traded in 
some things to get out other things.  An example is R's potentially 
large memory usage, which, one might argue, trades in analyses of very 
large datasets but gets out a very powerful and elegant programming 
language.

Rather than use absolutes, I'd encourage your colleague to be more 
specific.  Rather than say things like "R is poorly designed", I'd 
like to hear "R is poorly designed for [fill in the blank]".  Then we 
can get a better handle on the world in which s/he lives.

-roger
Barry Rowlingson wrote:
A colleague is receiving some data from another person. That person 
reads the data in SAS and it takes 30s and uses 64k RAM. That person 
then tries to read the data in R and it takes 10 minutes and uses a 
gigabyte of RAM. Person then goes on to say:

  It's not that I think SAS is such great software,
  it's not.  But I really hate badly designed
  software.  R is designed by committee.  Worse,
  it's designed by a committee of statisticians.
  They tend to confuse numerical analysis with
  computer science and don't have any idea about
  software development at all.  The result is R.
  I do hope [your colleague] won't have to waste time doing
  [this analysis] in an outdated and poorly designed piece
  of software like R.
Would any of the committee like to respond to this? Or shall we just 
slap our collective forehead and wonder how someone could get such a view?


Barry

--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/


RE: [R] anti-R vitriol

2004-06-29 Thread Liaw, Andy
 From: Barry Rowlingson
 
 A colleague is receiving some data from another person. That person 
 reads the data in SAS and it takes 30s and uses 64k RAM. That person 
 then tries to read the data in R and it takes 10 minutes and uses a 
 gigabyte of RAM. Person then goes on to say:
 
It's not that I think SAS is such great software,
it's not.  But I really hate badly designed
software.  R is designed by committee.  Worse,
it's designed by a committee of statisticians.
They tend to confuse numerical analysis with
computer science and don't have any idea about
software development at all.  The result is R.
 
I do hope [your colleague] won't have to waste time doing
[this analysis] in an outdated and poorly designed piece
of software like R.
 
 Would any of the committee like to respond to this? Or shall we just 
 slap our collective forehead and wonder how someone could get 
 such a view?
 
 Barry
 

My $0.02:

R, being a flexible programming language, has an amazing ability to cope
with people's laziness/ignorance/inelegance, but it comes at a (sometimes
hefty) price.  While there are no specifics on the situation leading to the
person's comments, here's one (not as extreme) example that I happened to come
across today:

> system.time(spam <- read.table("data_dmc2003_train.txt",
+                                header=TRUE,
+                                colClasses=c(rep("numeric", 833),
+                                             "character")))
[1] 15.92  0.09 16.80    NA    NA
> system.time(spam <- read.table("data_dmc2003_train.txt", header=TRUE))
[1] 187.29   0.60 200.19     NA     NA

My SAS ability is rather severely limited, but AFAIK, one needs to specify
_all_ variables to be read into a dataset in order to read in the data in
SAS.  If one has that information, R can be very efficient as well.  Without
that information, one gets nothing in SAS, or one just lets R do the hard work.
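
For readers who want to try the colClasses approach without the original file, a small self-contained sketch follows. The data are made up, and 5 numeric columns stand in for the 833 in the example above; only the shape of the read.table() call is meant to carry over.

```r
# Made-up stand-in for data_dmc2003_train.txt: n_num numeric columns
# followed by one character column.
n_num <- 5
toy <- data.frame(matrix(rnorm(20 * n_num), ncol = n_num),
                  label = sample(c("spam", "ham"), 20, replace = TRUE))
tmp <- tempfile(fileext = ".txt")
write.table(toy, tmp, row.names = FALSE, quote = FALSE)

# Declaring the column types up front spares read.table() from
# inspecting every field to guess them.
spam <- read.table(tmp, header = TRUE,
                   colClasses = c(rep("numeric", n_num), "character"))
col_types <- sapply(spam, class)
print(col_types)

unlink(tmp)
```

On a file this small the timing difference is negligible; the saving reported above comes from the many columns and rows of the real data set.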

Best,
Andy



Re: [R] anti-R vitriol

2004-06-29 Thread Gabor Grothendieck

Barry Rowlingson <B.Rowlingson at lancaster.ac.uk> writes:

: A colleague is receiving some data from another person. That person 
: reads the data in SAS and it takes 30s and uses 64k RAM. That person 
: then tries to read the data in R and it takes 10 minutes and uses a 
: gigabyte of RAM. Person then goes on to say:
: 
:It's not that I think SAS is such great software,
:it's not.  But I really hate badly designed
:software.  R is designed by committee.  Worse,
:it's designed by a committee of statisticians.
:They tend to confuse numerical analysis with
:computer science and don't have any idea about
:software development at all.  The result is R.
: 
:I do hope [your colleague] won't have to waste time doing
:[this analysis] in an outdated and poorly designed piece
:of software like R.
: 
: Would any of the committee like to respond to this? Or shall we just 
: slap our collective forehead and wonder how someone could get such a view?

Does he have to repeatedly read in different large datasets or is this 
just a one time requirement?  In the latter case, he could read in the 
data, save it (using the save command), and then just load it (using
the load command) in subsequent sessions.  He would only have to wait 
10 minutes the first time.  If he has that much data, it's probably a 
large project and a one time hit of 10 minutes versus several days, 
weeks or months of work seems negligible.
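
The save-once, load-later workflow described above can be sketched as follows. The data frame and the file name are placeholders; in practice `big` would be the result of the slow read.table() call.

```r
# Placeholder for the data that took a long time to read in.
big <- data.frame(id = 1:1000, value = rnorm(1000))
rda <- file.path(tempdir(), "big.RData")  # hypothetical file name

save(big, file = rda)  # one-time cost, right after the slow read

original <- big        # kept only so this sketch can check the round trip
rm(big)

load(rda)              # later sessions: restores the object as `big`
round_trip_ok <- identical(big, original)
```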



[OT] Ordering pizza [was Re: [R] anti-R vitriol]

2004-06-29 Thread Douglas Bates
Roger D. Peng wrote:
I'm not too concerned about your colleague's view about R.  S/He doesn't 
have to like it, and I don't think anyone actually believes that R is 
designed to make *everyone* happy.  For me, R does about 99% of the 
things I need to do, but sadly, when I need to order a pizza, I still 
have to pick up the telephone.
There are several chains of pizzerias in the U.S. that provide for 
Internet-based ordering (e.g. www.papajohnsonline.com) so, with the 
Internet modules in R, it's only a matter of time before you will have a 
pizza-ordering function available.



Re: [OT] Ordering pizza [was Re: [R] anti-R vitriol]

2004-06-29 Thread Rolf Turner

Dang!  You're making me hungry!

cheers,

Rolf Turner



Re: [OT] Ordering pizza [was Re: [R] anti-R vitriol]

2004-06-29 Thread Prof Brian Ripley
On Tue, 29 Jun 2004, Douglas Bates wrote:

 Roger D. Peng wrote:
 
  I'm not too concerned about your colleague's view about R.  S/He doesn't 
  have to like it, and I don't think anyone actually believes that R is 
  designed to make *everyone* happy.  For me, R does about 99% of the 
  things I need to do, but sadly, when I need to order a pizza, I still 
  have to pick up the telephone.
 
 There are several chains of pizzerias in the U.S. that provide for 
 Internet-based ordering (e.g. www.papajohnsonline.com) so, with the 
 Internet modules in R, it's only a matter of time before you will have a 
 pizza-ordering function available.

Indeed, the GraphApp toolkit (used for the RGui interface under R for
Windows, but Guido forgot to include it) provides one (for use in Sydney,
Australia, we presume as that is where the GraphApp author hails from).
Alternatively, a Padovian has no need of ordering pizzas with both home 
and neighbourhood restaurants 

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] anti-R vitriol

2004-06-29 Thread A.J. Rossini
Barry Rowlingson [EMAIL PROTECTED] writes:

It's not that I think SAS is such great software,
it's not.  But I really hate badly designed
software.  R is designed by committee.  Worse,
it's designed by a committee of statisticians.
They tend to confuse numerical analysis with
computer science and don't have any idea about
software development at all.  The result is R.

They'd probably prefer computer scientists and numerical analysts who
confuse data munging with statistical data analysis, a common problem
in mixed departments...

best,
-tony

-- 
[EMAIL PROTECTED]                   http://www.analytics.washington.edu/ 
Biomedical and Health Informatics   University of Washington
Biostatistics, SCHARP/HVTN  Fred Hutchinson Cancer Research Center
UW (Tu/Th/F): 206-616-7630 FAX=206-543-3461 | Voicemail is unreliable
FHCRC  (M/W): 206-667-7025 FAX=206-667-4812 | use Email
