Re: [R] R in the NY Times

2009-01-11 Thread Marc Schwartz
on 01/10/2009 01:50 PM Kingsford Jones wrote:
 The reactions to the NYT article have certainly made for some
 interesting reading.
 
 Here are some of the links:
 
 http://overdetermined.net/site/content/new-york-times-article-r
 
 http://jackman.stanford.edu/blog/?p=1053
 
 http://ggorjan.blogspot.com/2009/01/new-york-times-on-r.html
 
 several posts on Andrew Gelman's blog:
 http://www.stat.columbia.edu/~gelman/blog/
 
 http://www.reddit.com/r/programming/comments/7nwgq/the_new_york_times_notices_the_r_programming/
 
 comments here: http://bits.blogs.nytimes.com/2009/01/08/r-you-ready-for-r/
 
 
 It's too bad that SAS has reacted to the negative reactions to their
 NYT quote with more FUD.  The quote that Tony posted is just a
 thinly-veiled jab at R (veiled by a disingenuous "we value open
 source" veneer).  Perhaps SAS is shooting themselves in the foot with
 their reactions; aren't they making it harder if they should ever
 decide the best thing to do is to embrace R and the philosophies
 behind it?  Four years ago, Marc Schwartz posted interesting comments
 related to this:
 
 http://tolstoy.newcastle.edu.au/R/help/04/12/9497.html


Thanks for pointing this out Kingsford. The books referenced there are
excellent for providing an understanding of the dynamics that have been
the subject of many of these threads here since the NYT article was
published.

There is a natural tension between leading-edge adopters, the
mainstream and the laggards. Moore's "Crossing the Chasm" provides good
insights into this tension and the acceptance of new products and
technology.

Grove's "Only the Paranoid Survive" shows how individual companies and
even entire industries (think banking and autos today) can suddenly face
an unexpected risk to their survival when they fail to comprehend
marketplace dynamics and take appropriate action.

Microsoft's mis-steps vis-a-vis Vista opened the door for Apple and
Linux to increase their respective market share and for open source more
generally (e.g., Firefox).

BTW, readers might find this commentary of interest:

Commentary: Create a tech-friendly U.S. government
By Jimmy Wales and Andrea Weckerle

http://www.cnn.com/2009/TECH/01/07/wales.obama.cto/index.html


 On another note, I wonder why in the various conversations there seems
 to be pervasive views that a) the FDA won't accept work done in R, and
 b) SAS is the only way to effectively handle data?


I strongly believe that the comments regarding R and the FDA are overly
negative and pessimistic.

The hurdles to the use of R for clinical trials are shrinking. There has
been substantive activity over the past several years, both internally
at the FDA and within the R community to increase R's acceptance in this
domain.

At the Joint Statistical Meetings in 2006, Sue Bell from the FDA spoke
during a session with a presentation entitled "Times 'R' A Changing: FDA
Perspectives on Use of Open Source". A copy of this presentation is
available here:

  http://www.fda.gov/cder/Offices/Biostatistics/Bell.pdf

In 2007, during an FDA committee meeting reviewing the safety profile of
Avandia (Rosiglitazone), the internal FDA meta-analysis performed by Joy
Mele, the FDA statistician, was done using R. A copy of this
presentation is available here:
  http://www.fda.gov/ohrms/dockets/ac/07/slides/2007-4308s1-05-fda-mele.ppt

Given the high profile nature of drug safety issues today, that R was
used for this analysis by the FDA itself speaks volumes.

Also in 2007, at the annual R user meeting at Iowa State University, I
had the pleasure and privilege of Chairing a session on the use of R for
clinical trials. The speakers included Frank Harrell (well known to R
users here), Tony Rossini and David James (Novartis Pharmaceuticals) and
Mat Soukup (FDA statistician). Copies of our presentations are available
here, a little more than half way down the page:

  http://user2007.org/program/

At that meeting, we also introduced a document that has been updated
since then and approved formally by the R Foundation for Statistical
Computing. The document provides guidance for the use of R in the
regulated clinical trials domain, addresses R's compliance with the
relevant regulations (eg. 21 CFR 11) as well as describing the
development, testing and quality processes in place for R, also known as
the Software Development Life Cycle.

That document is available here:

  http://www.r-project.org/doc/R-FDA.pdf

I have heard directly from colleagues in industry that this document has
provided significant value in their internal discussions regarding
implementing the use of R within their respective environments and
assuaging many fears regarding R's use.

Additionally, presentations regarding the use of open source software
and R specifically for clinical trials have been made at DIA and other
industry meetings. This fall, there is a session on the use of R
scheduled for the FDA's Industry Statistics Workshop in Washington, D.C.

For those unfamiliar, I would also 

Re: [R] R in the NY Times

2009-01-10 Thread Tony Breyal
“We have customers who build engines for aircraft. I am happy they are
not using freeware when I get on a jet.”

The lady who made this comment, Anne H. Milley, director of technology
product marketing at SAS, has written a response to try and clarify
what she meant (funnily enough, I got this link from a SAS mate of
mine who is now going to have a look into R for the first time):

http://blogs.sas.com/sascom/index.php?/archives/434-This-post-is-rated-R.html


[quote]
As for open source and my airplane quote …

My remark reflects a key difference between R and SAS, that of
support, reliability, and validation. Customers value SAS for many
things, including our extensive testing, documentation, 24/7 support,
and training. In contrast, the quality of proliferating R packages is
varied and uneven, especially in complex analytical modules. Mistakes
in these packages can lead to misleading results, even for experienced
users.

The airplane comment was meant to point out this key difference. Not
to condemn open source. In fact, SAS values open-source software. Our
software runs on Linux. We use some open-source tools in development.
And we plan to embrace open source further in the future.

The world has many complex problems. We advocate approaches based on
science, on analysis to address these problems. Making more analytic
methods readily available is a good thing. From SAS; from R; from the
resourceful individuals who innovate with their tools of choice,
regardless of the source.
[end quote]



On 7 Jan, 14:50, Marc Schwartz marc_schwa...@comcast.net wrote:
 on 01/07/2009 08:44 AM Kevin E. Thorpe wrote:



  Zaslavsky, Alan M. wrote:
  This article is accompanied by nice pictures of Robert and Ross.

  Data Analysts Captivated by Power of R
 http://www.nytimes.com/2009/01/07/technology/business-computing/07pro...

  January 7, 2009 Data Analysts Captivated by R’s Power By ASHLEE VANCE

  SAS says it has noticed R’s rising popularity at universities,
  despite educational discounts on its own software, but it dismisses
  the technology as being of interest to a limited set of people
  working on very hard tasks.

  “I think it addresses a niche market for high-end data analysts that
  want free, readily available code,” said Anne H. Milley, director of
  technology product marketing at SAS. She adds, “We have customers who
  build engines for aircraft. I am happy they are not using freeware
  when I get on a jet.”

  Thanks for posting.  Does anyone else find the statement by SAS to be
  humourous yet arrogant and short-sighted?

  Kevin

 It is an ignorant comment by a marketing person who has been spoon fed
 her lines...it is also a comment being made from a very defensive and
 insecure posture.

 Congrats to R Core and the R Community. This is yet another sign of R's
 growth and maturity.

 Regards,

 Marc Schwartz

 __
 r-h...@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R in the NY Times

2009-01-10 Thread Florian Lengyel
On Sat, Jan 10, 2009 at 8:11 AM, Tony Breyal tony.bre...@googlemail.com wrote:
 We have customers who build engines for aircraft. I am happy they are
 not using freeware when I get on a jet.

 The lady who made this comment, Anne H. Milley, director of technology
 product marketing at SAS, has written a response to try and clarify
 what she meant (funnily enough, I got this link from a SAS mate of
 mine who is now going to have a look into R for the first time):

 http://blogs.sas.com/sascom/index.php?/archives/434-This-post-is-rated-R.html


 [quote]
 As for open source and my airplane quote …

 My remark reflects a key difference between R and SAS, that of
 support, reliability, and validation. Customers value SAS for many
 things, including our extensive testing, documentation, 24/7 support,
 and training. In contrast, the quality of proliferating R packages is
 varied and uneven, especially in complex analytical modules. Mistakes
 in these packages can lead to misleading results, even for experienced
 users.

 The airplane comment was meant to point out this key difference. Not
 to condemn open source. In fact, SAS values open-source software. Our
 software runs on Linux. We use some open-source tools in development.
 And we plan to embrace open source further in the future.

 The world has many complex problems. We advocate approaches based on
 science, on analysis to address these problems. Making more analytic
 methods readily available is a good thing. From SAS; from R; from the
 resourceful individuals who innovate with their tools of choice,
 regardless of the source.
 [end quote]



Ms. Milley mischaracterizes her remark about the relative
unreliability of freeware as if she had employed the term
"open source."

David A. Wheeler's "Why Open Source Software / Free Software (OSS/FS,
FLOSS, or FOSS)? Look at the Numbers!" provides quantitative measures
for evaluating open source software, including market share,
reliability, performance, scalability, security, and total cost of
ownership. With respect to the reliability of open source software,
Wheeler writes, "There are a lot of anecdotal stories that OSS/FS is
more reliable, but finally there is quantitative data confirming that
mature OSS/FS programs are often more reliable" [than equivalent
proprietary software programs]. Wheeler lists among his sources the
Fuzz Report http://pages.cs.wisc.edu/~bart/fuzz/fuzz.html .



Re: [R] R in the NY Times

2009-01-10 Thread Barry Rowlingson
2009/1/10 Tony Breyal tony.bre...@googlemail.com:

 [SAS marketroid quote]
 In fact, SAS values open-source software.

 But clearly not enough to open-source SAS itself. It would seem that
SAS values _other_people's_ open source.

 If SAS was open source and free, then SAS would collect on all the
other things Customers value SAS for - support, testing, training,
docs, etc etc. And there would be a lot more people using it.

 Another quote: "We advocate approaches based on science" - closed
source is closed knowledge and is nearer alchemy than science. I may
use proprietary software for video editing or music production, but
when it comes to science, it's got to be open.

Barry



Re: [R] R in the NY Times

2009-01-10 Thread Ajay ohri
more on the reasons R is bad for you
http://www.decisionstats.com/2009/01/top-ten-rrreasons-r-is-bad-for-you/


On Sun, Jan 11, 2009 at 12:01 AM, Barry Rowlingson 
b.rowling...@lancaster.ac.uk wrote:

 2009/1/10 Tony Breyal tony.bre...@googlemail.com:

  [SAS marketroid quote]
  In fact, SAS values open-source software.

  But clearly not enough to open-source SAS itself. It would seem that
 SAS values _other_people's_ open source.

  If SAS was open source and free, then SAS would collect on all the
 other things Customers value SAS for - support, testing, training,
 docs, etc etc. And there would be a lot more people using it.

  Another quote: We advocate approaches based on science - closed
 source is closed knowledge and is nearer alchemy than science. I may
 use proprietary software for video editing or music production, but
 when it comes to science, it's got to be open.

 Barry



Re: [R] R in the NY Times

2009-01-10 Thread Bert Gunter

I think the substance of the issue is that the more eyes on code, the fewer
the bugs (assuming a well-designed examination and debugging process is in
place, as is typical for large open source projects like R). By this
(obvious?) criterion, both the remarks about the dangers of proprietary code
and the greater unreliability of R's lesser-used specialty packages, which
by their nature tend to be less carefully perused, are valid.

Perhaps an argument is that certain code might not get written at all if it
were not proprietary. Device drivers might be an example. But possibly other
than that, it does seem like SAS needs to reconsider their marketing
strategy and advertising claims.

Anecdotal remark: I originally moved from S-Plus to R because R provided
**better** documentation, support, and had fewer bugs, which were more
rapidly fixed when found. One of my smarter investment choices.

Cheers to all,
Bert Gunter

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Barry Rowlingson
Sent: Saturday, January 10, 2009 10:32 AM
To: Tony Breyal
Cc: r-help@r-project.org
Subject: Re: [R] R in the NY Times

2009/1/10 Tony Breyal tony.bre...@googlemail.com:

 [SAS marketroid quote]
 In fact, SAS values open-source software.

 But clearly not enough to open-source SAS itself. It would seem that
SAS values _other_people's_ open source.

 If SAS was open source and free, then SAS would collect on all the
other things Customers value SAS for - support, testing, training,
docs, etc etc. And there would be a lot more people using it.

 Another quote: We advocate approaches based on science - closed
source is closed knowledge and is nearer alchemy than science. I may
use proprietary software for video editing or music production, but
when it comes to science, it's got to be open.

Barry



Re: [R] R in the NY Times

2009-01-10 Thread Kingsford Jones
The reactions to the NYT article have certainly made for some
interesting reading.

Here are some of the links:

http://overdetermined.net/site/content/new-york-times-article-r

http://jackman.stanford.edu/blog/?p=1053

http://ggorjan.blogspot.com/2009/01/new-york-times-on-r.html

several posts on Andrew Gelman's blog:
http://www.stat.columbia.edu/~gelman/blog/

http://www.reddit.com/r/programming/comments/7nwgq/the_new_york_times_notices_the_r_programming/

comments here: http://bits.blogs.nytimes.com/2009/01/08/r-you-ready-for-r/


It's too bad that SAS has reacted to the negative reactions to their
NYT quote with more FUD.  The quote that Tony posted is just a
thinly-veiled jab at R (veiled by a disingenuous "we value open
source" veneer).  Perhaps SAS is shooting themselves in the foot with
their reactions; aren't they making it harder if they should ever
decide the best thing to do is to embrace R and the philosophies
behind it?  Four years ago, Marc Schwartz posted interesting comments
related to this:

http://tolstoy.newcastle.edu.au/R/help/04/12/9497.html


On another note, I wonder why in the various conversations there seems
to be pervasive views that a) the FDA won't accept work done in R, and
b) SAS is the only way to effectively handle data?


best,

Kingsford Jones







 On 7 Jan, 14:50, Marc Schwartz marc_schwa...@comcast.net wrote:
 on 01/07/2009 08:44 AM Kevin E. Thorpe wrote:



  Zaslavsky, Alan M. wrote:
  This article is accompanied by nice pictures of Robert and Ross.

  Data Analysts Captivated by Power of R
 http://www.nytimes.com/2009/01/07/technology/business-computing/07pro...

  January 7, 2009 Data Analysts Captivated by R's Power By ASHLEE VANCE

  SAS says it has noticed R's rising popularity at universities,
  despite educational discounts on its own software, but it dismisses
  the technology as being of interest to a limited set of people
  working on very hard tasks.

  I think it addresses a niche market for high-end data analysts that
  want free, readily available code, said Anne H. Milley, director of
  technology product marketing at SAS. She adds, We have customers who
  build engines for aircraft. I am happy they are not using freeware
  when I get on a jet.

  Thanks for posting.  Does anyone else find the statement by SAS to be
  humourous yet arrogant and short-sighted?

  Kevin

 It is an ignorant comment by a marketing person who has been spoon fed
 her lines...it is also a comment being made from a very defensive and
 insecure posture.

 Congrats to R Core and the R Community. This is yet another sign of R's
 growth and maturity.

 Regards,

 Marc Schwartz



Re: [R] R in the NY Times

2009-01-10 Thread Johannes Huesing
Bert Gunter gunter.ber...@gene.com [Sat, Jan 10, 2009 at 08:31:03PM CET]:
[...]
 Perhaps an argument is that certain code might not get written at all if it
 were not proprietary. Device drivers might be an example. 

Device drivers are not an example. Linux is ubiquitous _despite_ device 
manufacturers being secretive about their protocols and interfaces. There's
a whole lot of people out there who seem to take pride, if not joy, in 
reengineering. At the moment I am profiting immensely from the gpsbabel
tool, which translates readily between all different GPS-related formats,
closed or documented.

-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:johan...@huesing.name  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi)



Re: [R] R in the NY Times

2009-01-10 Thread Gabor Grothendieck
There do exist device manufacturers who GPL their device drivers, e.g.

http://freshmeat.net/projects/wanpipe/?branch_id=73783release_id=290741

On Sat, Jan 10, 2009 at 2:31 PM, Bert Gunter gunter.ber...@gene.com wrote:

 I think the substance of the issue is that the more eyes on code, the fewer
 the bugs (assuming a well-designed examination and debugging process is in
 place, as is typical for large open source projects like R). By this
 (obvious?) criterion, both the remarks about the dangers of proprietary code
 and the greater unreliability of R's lesser-used specialty packages, which
 by their nature tend to be less carefully perused, are valid.

 Perhaps an argument is that certain code might not get written at all if it
 were not proprietary. Device drivers might be an example. But possibly other
 than that, it does seem like SAS needs to reconsider their marketing
 strategy and advertising claims.

 Anecdotal remark: I originally moved from S-Plus to R because R provided
 **better** documentation, support, and had fewer bugs, which were more
 rapidly fixed when found. One of my smarter investment choices.

 Cheers to all,
 Bert Gunter







 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Barry Rowlingson
 Sent: Saturday, January 10, 2009 10:32 AM
 To: Tony Breyal
 Cc: r-help@r-project.org
 Subject: Re: [R] R in the NY Times

 2009/1/10 Tony Breyal tony.bre...@googlemail.com:

 [SAS marketroid quote]
 In fact, SAS values open-source software.

  But clearly not enough to open-source SAS itself. It would seem that
 SAS values _other_people's_ open source.

  If SAS was open source and free, then SAS would collect on all the
 other things Customers value SAS for - support, testing, training,
 docs, etc etc. And there would be a lot more people using it.

  Another quote: We advocate approaches based on science - closed
 source is closed knowledge and is nearer alchemy than science. I may
 use proprietary software for video editing or music production, but
 when it comes to science, it's got to be open.

 Barry



Re: [R] R in the NY Times

2009-01-08 Thread Marc Schwartz
on 01/07/2009 09:47 PM Gabor Grothendieck wrote:
 On Wed, Jan 7, 2009 at 10:26 PM, Dirk Eddelbuettel e...@debian.org wrote:
 On 7 January 2009 at 18:24, Gabor Grothendieck wrote:
 | By running the code below we see that the:
 | - sum of the three seems to be rising at a constant rate
 | - S is declining
 | - SAS and R are rising
 | - R is rising the fastest, though it has completed its phase
 | of highest growth which ended around 2004

 I wonder whether we need to account for traffic on all the additional r-sig-*
 mailing lists ?

 Of the handful that I follow, some seem to have taken traffic from r-help.
 This could account for (at least parts of) the apparent traffic growth
 slowdown since 2004 as many of these added lists appeared only in the last
 few years.

 
 Good observation.  It would be interesting to combine the data from all
 the lists to see what the effect is.

Agreed.

You can use the basic framework of the R-Help code that I posted
yesterday to do that.

The key gotcha is that some of the list archives have the posts stored
on a per calendar quarter basis, not monthly. At least one has a mix.
This seems to be somewhat dependent upon list volume, though that is not
a consistent factor.

Thus, you would have to review each archive individually and adjust the
archive URLs in the code accordingly.

You would also see the impact on the subsequent aggregation of the data,
since the monthly time series based analyses (as opposed to yearly) will
have to be adjusted, given the differing granularity of the data.
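
The granularity adjustment described above might be sketched as follows.
This is only an illustration under stated assumptions: the counts are made
up, and the object names (monthly, other, combined) are hypothetical, not
taken from the R-Help code referred to above. It collapses a monthly series
to quarterly totals so that it can be added to a list archived per quarter.

```r
# Hypothetical monthly post counts for one list (values are made up).
monthly <- ts(c(120, 95, 130, 110, 140, 125, 150, 135, 160, 145, 170, 155),
              start = c(2008, 1), frequency = 12)

# Sum each block of three months to obtain quarterly totals.
quarterly <- aggregate(monthly, nfrequency = 4, FUN = sum)

# A second, hypothetical list whose archive is stored per quarter.
other <- ts(c(40, 55, 60, 70), start = c(2008, 1), frequency = 4)

# Both series now share a quarterly index and can be added directly.
combined <- quarterly + other
print(combined)
```

Going the other way (quarterly to monthly) is not possible without extra
assumptions, which is why any combined analysis is limited to the coarsest
granularity among the lists included.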

HTH,

Marc



Re: [R] R in the NY Times

2009-01-08 Thread Max Kuhn
More commentary on Slashdot:

 http://developers.slashdot.org/article.pl?sid=09/01/07/2316227



Re: [R] R in the NY Times

2009-01-08 Thread Doran, Harold
The open-source mentality is invaluable, as most on this list know. That
is what keeps the R evolution progressing at a pace that SAS cannot keep
up with. 

On a side note (a very side note), I am a zealot for an exercise program
called Crossfit. Crossfit has adopted the same open-source mentality as
found in the Linux model and has grown into the most valuable fitness
and strength training program on the planet. There is an online journal
(called crossfit journal)
http://library.crossfit.com/free/pdf/CrossFitJournal-Budding_Retrospective.pdf
that lists the three components of the linux open-source model:

The Linux development model:
* Release early and often
* Delegate everything you can
* Be open to the point of promiscuity

Crossfit then followed with its own open-source principles:

The CrossFit development model:
* Release early and often
- Daily!
* Delegate everything you can
- Meet the experts from the realms of climbing, lifting, swimming,
gymnastics, fighting, you name it.
* Be open to the point of promiscuity
- Read the WOD weblog comments.
- Check out the discussion board.
- See photos of athletes puking!

The point being, it is not the program itself that is amazing, but the
people that have made serious contributions to it that make it so. In
the same vein, R is only a representation of the many, many valuable
talented people who are constantly adding to its functionality because
of its open-source nature. That is, R itself is good, useful etc. But,
it is the people that add to it and help it grow as a scientific tool
that keep it as the lingua franca.







 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Max Kuhn
 Sent: Thursday, January 08, 2009 10:17 AM
 To: r-help@r-project.org
 Subject: Re: [R] R in the NY Times
 
 More commentary on Slashdot:
 
  http://developers.slashdot.org/article.pl?sid=09/01/07/2316227
 


Re: [R] R in the NY Times

2009-01-08 Thread Stas Kolenikov
On 1/7/09, Gabor Grothendieck ggrothendi...@gmail.com wrote:
 Here is the same number of messages/posts data
  for each of S, SAS, R:
  - reworked into a 3 column ts class time series
  - with Jan 2009 removed since it's not complete
  - leading and trailing NA rows removed

My software of choice is Stata, so here are compatible data from
statalist (using
http://www.hsph.harvard.edu/cgi-bin/lwgate/STATALIST/archives/):

## Statalist traffic
## Rows are years (2001-2008), columns are months; each line below is one month.
## The May 2007 value is missing from the archived post and is left as NA.
stata <- structure(c(
654,574,781, 848, 714, 823,1063,1057,
701,625,909, 799, 941,1052,1013,1269,
868,690,937,1155,1040,1113,1125,1252,
640,649,899, 898,1013,1161, 991,1325,
622,697,726,1102, 818,1077,  NA,1374,
684,548,651, 876, 964, 963,1125,1078,
717,588,943, 923, 885, 892, 986,1200,
728,575,605, 901,1010,1011,1224,1396,
627,605,712, 807,1098, 951, 939,1446,
844,790,970, 940,1001,1283,1231,1509,
776,644,870, 928,1094, 928, 999,1340,
603,512,670, 824, 794, 951, 739,1056
),
.Dim = c(8L, 12L),
.Dimnames = list(c("2001", "2002", "2003", "2004", "2005",
"2006", "2007", "2008"), c("Jan", "Feb", "Mar", "Apr",
"May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")))

The list existed from 1994 or 1996 or so, but the data are only
available from 2001. You'd probably be surprised to find out that
based on the list summaries, the size of Stata world is about half of
SAS on the counts plot; and on the log scale, it shows linear (which
means, exponential) growth throughout the range, while both SAS and R
have been slowing down in the last couple of years (with an
explanation already offered regarding the r-sig-* lists).
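
To illustrate why a straight line on the log scale means exponential growth,
here is a small sketch. The series is synthetic (a constant 10% per year,
chosen for illustration) and is not list traffic; the names year, posts and
growth are hypothetical.

```r
# Synthetic yearly totals growing 10% per year (illustrative, not list data).
year  <- 2001:2008
posts <- 1000 * 1.10^(year - 2001)

# A straight line on the log scale: log(posts) = a + b * year.
fit <- lm(log(posts) ~ year)

# exp(b) - 1 is the implied constant annual growth rate.
growth <- unname(exp(coef(fit)["year"]) - 1)
print(round(growth, 3))
```

With real counts the fit would not be exact, and a slope that flattens in
later years would show up as curvature in the residuals.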

Of course overall that's an incorrect comparison, to begin with. The
support systems for all three packages are different: most (US)
universities will have dedicated and well-certified SAS gurus
answering most semicolon questions locally, while r-help would be the
first thing on my mind if I cannot get what I need in the docs. I
would thus expect traffic on r-help to be heavier relative to the
user base.

Another measure of interest might be the number of contributed
packages. The phrase for R is this: "Currently, the CRAN package
repository features 1633 objects including 1625 packages and 8 bundles
containing 34 packages, for a total of 1659 available packages." The
phrase for Stata is this: "Statistical Software Components,
Boston College Department of Economics: There are currently 1275 items
in this series, of which 1274 are downloadable"
(http://logec.repec.org/scripts/seriesstat.pl?item=repec:boc:bocode).
So programming activity in Stata is about 3/4 of that in R at their
face values (you would probably need to downplay both numbers for
obsolete packages, though). Whether SAS has a unified repository of
user contributed modules with direct counts available, I have no clue.
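
The "about 3/4" figure can be checked from the two counts quoted above; the
variable names here are only for illustration.

```r
cran_packages <- 1659  # total available CRAN packages, as quoted above
ssc_items     <- 1275  # items in the SSC series, as quoted above

# Ratio of Stata contributions to R contributions at face value.
ratio <- ssc_items / cran_packages
print(round(ratio, 2))
```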

A really good measure for R will be the total # of the downloads of
r-base for all platforms from all CRAN mirrors (and I would expect
that # can be found from the servers' logs). Given that it is so easy
to download everything nice and clean and up to date, I would doubt
anybody will be distributing CD-ROMs with R install files among
friends and colleagues. SAS (and Stata, and SPSS, and Minitab, and...)
should have their (internal) number of licenses sold (and yes those
come on the disks initially), but those are badly blurred by the
network licenses, and are commercial secrets, anyway.
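
As a rough sketch of how such a count could be taken from server logs: the
log lines below are invented, and the assumption that a mirror logs in
Common Log Format with file names of this shape is mine, not something
known about any particular CRAN mirror.

```r
# Hypothetical access-log lines; hosts, paths and file names are made up.
log_lines <- c(
  '10.0.0.1 - - [08/Jan/2009:10:00:00 +0000] "GET /bin/windows/base/R-2.8.1-win32.exe HTTP/1.1" 200 33000000',
  '10.0.0.2 - - [08/Jan/2009:10:01:00 +0000] "GET /src/base/R-2/R-2.8.1.tar.gz HTTP/1.1" 200 21000000',
  '10.0.0.3 - - [08/Jan/2009:10:02:00 +0000] "GET /web/packages/lme4/index.html HTTP/1.1" 200 5000',
  '10.0.0.1 - - [08/Jan/2009:10:03:00 +0000] "GET /bin/windows/base/R-2.8.1-win32.exe HTTP/1.1" 404 0'
)

# Requests for files that look like a base R distribution.
is_base <- grepl('"GET [^"]*/R-[0-9.]+[^/"]*\\.(exe|tar\\.gz|dmg) HTTP', log_lines)

# Requests that completed successfully (HTTP status 200).
is_ok <- grepl('" 200 ', log_lines, fixed = TRUE)

# Count successful base-R downloads.
downloads <- sum(is_base & is_ok)
print(downloads)
```

A real tally would also have to deal with partial or repeated downloads and
with mirrors that do not share their logs.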

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.



Re: [R] R in the NY Times

2009-01-08 Thread Louis Bajuk-Yorgan

As the product manager for S+, I'd like to comment as well. I think the
burgeoning interest in R demonstrates that there's demand for analytics
to solve real, business-critical problems in a broad spectrum of
companies and roles, and that some of the incumbent analytics offerings,
in particular SAS and SPSS, don't sufficiently meet the growing need for
analytics in many major companies. 

S+ (now TIBCO Spotfire S+) is of course a commercial software package
based on the S language, which was a forerunner of R as mentioned in the
article, and has been widely adopted. It is currently used in a wide
variety of areas, including Life Sciences, Financial Services, and
Utilities, for applications such as speeding the analysis of clinical
trial data, optimizing portfolios, and assessing potential sites for
building wind farms. 

I welcome, respect, and appreciate the vitality, creativity, and sheer
productivity of the R community, and the high quality of statistical
methods the community creates. And, because of the close historical ties
between the two products, it is generally easy to port most R statistics
into the commercial S+ environment, and we have worked to make that
easier in recent releases.  

Once in S+, these analytic methods can be incorporated into intuitive
tools for business decision makers and deployed to automated
environments, using visual workflows, web-based applications (using
standard web services), Spotfire Guided Applications for dynamic visual
analysis, and scalable, event-driven architectures using TIBCO's IT
infrastructure. S+ also provides some unique offerings, such as the
ability to flexibly and efficiently analyze very large data sets. 

In this way, I feel companies can maximize the value of their analytic
investments to make rapid business decisions, whether those analytics
are developed in R or S+. 

Regards,
Lou Bajuk-Yorgan
Sr. Director, Product Management
TIBCO Spotfire Division
lba...@tibco.com

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Douglas Bates
Sent: Wednesday, January 07, 2009 12:58 PM
To: marc_schwa...@comcast.net
Cc: r-help@r-project.org
Subject: Re: [R] R in the NY Times

On Wed, Jan 7, 2009 at 8:50 AM, Marc Schwartz
marc_schwa...@comcast.net wrote:
 on 01/07/2009 08:44 AM Kevin E. Thorpe wrote:
 Zaslavsky, Alan M. wrote:
 This article is accompanied by nice pictures of Robert and Ross.

 Data Analysts Captivated by Power of R
 http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html

 January 7, 2009 Data Analysts Captivated by R's Power By ASHLEE VANCE

 SAS says it has noticed R's rising popularity at universities,
 despite educational discounts on its own software, but it dismisses
 the technology as being of interest to a limited set of people
 working on very hard tasks.

 "I think it addresses a niche market for high-end data analysts that
 want free, readily available code," said Anne H. Milley, director of
 technology product marketing at SAS. She adds, "We have customers
 who build engines for aircraft. I am happy they are not using
 freeware when I get on a jet."

 Thanks for posting.  Does anyone else find the statement by SAS to be
 humorous yet arrogant and short-sighted?

 Kevin

 It is an ignorant comment by a marketing person who has been spoon fed
 her lines...it is also a comment being made from a very defensive and
 insecure posture.



Re: [R] R in the NY Times

2009-01-08 Thread Carlos J. Gil Bellosta
On Thu, 2009-01-08 at 10:42 -0600, Stas Kolenikov wrote:
 A really good measure for R will be the total # of the downloads of
 r-base for all platforms from all CRAN mirrors (and I would expect
 that # can be found from the servers' logs). 

Hello,

You overlook here that many of us download R directly from our Linux
distribution repositories.

Besides, given the free nature of R, some of us install it on several
computers, even, in my case, briefly on somebody else's computer if I
have an urgent task to solve. Of course, I would never do (or be able
to do) this with SAS...

So, the number of downloads from CRAN servers seems like a lousy proxy
for the total number of users of R.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



Re: [R] R in the NY Times

2009-01-08 Thread ohri2007
Yes, I think R as a package can really learn from SAS and SPSS in
making its GUI more user-friendly, even at the risk of dumbing down
some complexity.

Also, as a consultant I know that selling software requires a lot of
marketing follow-up, which is why R has lagged behind in actual
implementation and marketing (who will go on site at a client and
implement?), despite being more robust and of course helping companies
save costs in these critical times.

If you market R more and even get a 10% share of the commercial
market, imagine how many jobs you save by cutting down the software
costs of employers.

Ajay
www.decisionstats.com

-- 
Regards,

Ajay Ohri
http://tinyurl.com/liajayohri



Re: [R] R in the NY Times

2009-01-08 Thread Rahul-A.Agarwal
I believe R as a package has everything that people with little
knowledge of programming can handle quite easily. Moreover, even
someone with no programming knowledge can learn R without much effort.
I also believe that if people in the corporate world start using R
instead of other complex and very expensive software, then in this
job market we can save many jobs and can also save people.



Re: [R] R in the NY Times

2009-01-08 Thread Marc Schwartz
on 01/08/2009 01:12 PM Andrew Choens wrote:
 On Thu, 2009-01-08 at 10:42 -0600, Stas Kolenikov wrote:
 A really good measure for R will be the total # of the downloads of
 r-base for all platforms from all CRAN mirrors (and I would expect
 that # can be found from the servers' logs). Given that it is so easy
 to download everything nice and clean and up to date, I would doubt
 anybody will be distributing CD-ROMs with R install files among
 friends and colleagues. SAS (and Stata, and SPSS, and Minitab, and...)
 should have their (internal) number of licenses sold (and yes those
 come on the disks initially), but those are badly blurred by the
 network licenses, and are commercial secrets, anyway.
 
 The number of r-core downloads is definitely NOT representative of the
 number of people using R. If you use R on Windows or OS X, you will
 obviously download R from the mirrors. However, this methodology would
 effectively ignore many users of R on Linux. I use R on a regular basis
 and I have it installed on three separate systems, all running Ubuntu.
 In all of these cases, I am downloading and installing r-core from the
 Ubuntu Mirror in the USA, not from CRAN. 

I would also note that R has been available via the Fedora yum repos for
some time, which as with the Debian/Ubuntu repos, would be missed in
just counting CRAN downloads.

There are quite a few other Linux distributions that have a similar
infrastructure in place where R is available as an 'add-on' or where the
main distribution itself includes R.

Additionally, there are many folks who will build R from source code,
using the updated source tarballs via FTP or, as I do, by getting the
source code right from the R subversion repo. These too would not be
considered in a CRAN based count.

 Of course, the number of Linux users is minuscule compared to the number
 of Windows users, but I think it is safe to say the Linux users are, in
 general, a more tech-savvy group than Windows users and are more likely
 to be comfortable using R's interactive programming interface. I think
 it is also fair to say that MANY (though not all) Linux users would be
 uncomfortable installing SPSS or SAS or Stata onto their open-source
 system and would prefer to use R. Thus, Linux users probably account for
 a higher proportion of R's user-base than they do in the general
 computing population. . . . although I do not claim to actually know
 this proportion.
 
 Ehh. Comparing the popularity of computer software is incredibly tricky
 to do, especially when some of the software being compared is
 open-source.

Correct. Trying to extrapolate the number of users from any of these
measures is quite complex, if doable at all.

Even the posting frequencies I used yesterday need to be taken with a
grain of salt when trying to get a sense of growth.

As Dirk noted, the many R-SIG-* e-mail lists have offloaded some level
of traffic from R-Help, which may account for the rate of growth in the
R-Help posts declining somewhat since 2004 as Gabor pointed out, even
though the absolute number of annual posts continues to increase.

Reading the posts on SAS-L since yesterday via Google RSS, where the NYT
article was also posted, some have noted that SAS itself offers online
support forums (http://support.sas.com/forums/index.jspa). From a quick
review, it looks like the SAS.com forums date back to perhaps early
2006, thus possibly accounting for some of the leveling of the posts on
SAS-L recently.

HTH,

Marc Schwartz



Re: [R] R in the NY Times

2009-01-08 Thread Carlos J. Gil Bellosta
On Thu, 2009-01-08 at 13:52 -0600, Marc Schwartz wrote:
 Reading the posts on SAS-L since yesterday via Google RSS, where the
 NYT article was also posted, some have noted that SAS itself offers
 online support forums (http://support.sas.com/forums/index.jspa). From
 a quick review, it looks like the SAS.com forums date back to perhaps
 early 2006, thus possibly accounting for some of the leveling of the
 posts on SAS-L recently.

Hello,

Not only that: the corporate intranet of SAS (sections of which are
sometimes open to external consultants for certain products) also
contains forums with an uneven traffic flow. These will certainly absorb
part of the traffic that would otherwise hit lists like SAS-L.

In fact, in my five years' experience working (also as) a SAS consultant,
I have never posted to SAS-L. However, I have posted (or had my requests
posted by other SAS employees) on these forums.

Having said that, I should also add that R represents a threat to SAS
(which has not stood for Statistical Analysis System for a long time
already) in a business segment that very doubtfully accounts for more
than 5-10% of their revenue. They have to sell about 1000 licenses of
SAS/BASE and SAS/STAT in order to match the annual revenues from a
single license for a single solution in a single top tier bank.

It is quite amusing, though, to browse SAS's internal marketing
documentation --to which I had access some time ago-- on how to
compete against R. The SAS salesperson's statement in the article seems
to have been extracted verbatim from it.

Best regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com



[R] R in the NY Times

2009-01-07 Thread Zaslavsky, Alan M.
This article is accompanied by nice pictures of Robert and Ross.

Data Analysts Captivated by Power of R
  http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html

January 7, 2009
Data Analysts Captivated by R’s Power
By ASHLEE VANCE

To some people R is just the 18th letter of the alphabet. To others, it’s the 
rating on racy movies, a measure of an attic’s insulation or what pirates in 
movies say.

R is also the name of a popular programming language used by a growing number 
of data analysts inside corporations and academia. It is becoming their lingua 
franca partly because data mining has entered a golden age, whether being used 
to set ad prices, find new drugs more quickly or fine-tune financial models. 
Companies as diverse as Google, Pfizer, Merck, Bank of America, the 
InterContinental Hotels Group and Shell use it.

But R has also quickly found a following because statisticians, engineers and 
scientists without computer programming skills find it easy to use.

“R is really important to the point that it’s hard to overvalue it,” said Daryl 
Pregibon, a research scientist at Google, which uses the software widely. “It 
allows statisticians to do very intricate and complicated analyses without 
knowing the blood and guts of computing systems.”

It is also free. R is an open-source program, and its popularity reflects a 
shift in the type of software used inside corporations. Open-source software is 
free for anyone to use and modify. I.B.M., Hewlett-Packard and Dell make 
billions of dollars a year selling servers that run the open-source Linux 
operating system, which competes with Windows from Microsoft. Most Web sites 
are displayed using an open-source application called Apache, and companies 
increasingly rely on the open-source MySQL database to store their critical 
information. Many people view the end results of all this technology via the 
Firefox Web browser, also open-source software.

R is similar to other programming languages, like C, Java and Perl, in that it 
helps people perform a wide variety of computing tasks by giving them access to 
various commands. For statisticians, however, R is particularly useful because 
it contains a number of built-in mechanisms for organizing data, running 
calculations on the information and creating graphical representations of data 
sets.

Some people familiar with R describe it as a supercharged version of 
Microsoft’s Excel spreadsheet software that can help illuminate data trends 
more clearly than is possible by entering information into rows and columns.

What makes R so useful — and helps explain its quick acceptance — is that 
statisticians, engineers and scientists can improve the software’s code or 
write variations for specific tasks. Packages written for R add advanced 
algorithms, colored and textured graphs and mining techniques to dig deeper 
into databases.

Close to 1,600 different packages reside on just one of the many Web sites 
devoted to R, and the number of packages has grown exponentially. One package, 
called BiodiversityR, offers a graphical interface aimed at making calculations 
of environmental trends easier.

Another package, called Emu, analyzes speech patterns, while GenABEL is used to 
study the human genome.

The financial services community has demonstrated a particular affinity for R; 
dozens of packages exist for derivatives analysis alone.

“The great beauty of R is that you can modify it to do all sorts of things,” 
said Hal Varian, chief economist at Google. “And you have a lot of prepackaged 
stuff that’s already available, so you’re standing on the shoulders of giants.”

R first appeared in 1996, when the statistics professors Ross Ihaka and Robert 
Gentleman of the University of Auckland in New Zealand released the code as a 
free software package.

According to them, the notion of devising something like R sprang up during a 
hallway conversation. They both wanted technology better suited for their 
statistics students, who needed to analyze data and produce graphical models of 
the information. Most comparable software had been designed by computer 
scientists and proved hard to use.

Lacking deep computer science training, the professors considered their coding 
efforts more of an academic game than anything else. Nonetheless, starting in 
about 1991, they worked on R full time. “We were pretty much inseparable for 
five or six years,” Mr. Gentleman said. “One person would do the typing and one 
person would do the thinking.”

Some statisticians who took an early look at the software considered it rough 
around the edges. But despite its shortcomings, R immediately gained a 
following with people who saw the possibilities in customizing the free 
software.

John M. Chambers, a former Bell Labs researcher who is now a consulting 
professor of statistics at Stanford University, was an early champion. At Bell 
Labs, Mr. Chambers had helped develop S, another statistics software 

Re: [R] R in the NY Times

2009-01-07 Thread Bill Pikounis
Pardon my exuberance, but this is simply awesome. What a treat to find
on the front web page of the NY Times this morning under Technology. I
think the article is very well written by the author, and I think it
captures top highlights of why the software and community are so
special.

Continued high gratitude to all of R-core and the R community for its
unique accomplishments. Every bit of praise is well-earned and
deserved.

I have continuously claimed to colleagues (primarily pharma industry)
for the past 8 years or so that R is the most exciting thing going on
in the area of statistics.

Thanks,
Bill



Bill Pikounis
Statistician



On Wed, Jan 7, 2009 at 08:10, Zaslavsky, Alan M.
zasla...@hcp.med.harvard.edu wrote:
 This article is accompanied by nice pictures of Robert and Ross.

 Data Analysts Captivated by Power of R
  
 http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html

 January 7, 2009
 Data Analysts Captivated by R's Power
 By ASHLEE VANCE




Re: [R] R in the NY Times

2009-01-07 Thread Frank E Harrell Jr
This is great to see.  It's interesting that SAS Institute feels that 
non-peer-reviewed software with hidden implementations of analytic 
methods that cannot be reproduced by others should be trusted when 
building aircraft engines.


Frank



Re: [R] R in the NY Times

2009-01-07 Thread Frank E Harrell Jr

Bill Pikounis wrote:

Pardon my exuberance, but this is simply awesome. What a treat to find
on the front web page of the NY Times this morning under Technology. I
think the article is very well written by the author, and I think it
captures top highlights of why the software and community are so
special.

Continued high gratitude to all of R-core and the R community for its
unique accomplishments. Every bit of praise is well-earned and
deserved.

I have continuously claimed to colleagues (primarily pharma industry)
for the past 8 years or so that R is the most exciting thing going on in
the area of statistics.

Thanks,
Bill


Amen to that, and in addition, R is now the top tool for everyday 
analysis, not just a research statistician's tool.


Frank





Bill Pikounis
Statistician



On Wed, Jan 7, 2009 at 08:10, Zaslavsky, Alan M.
zasla...@hcp.med.harvard.edu wrote:

This article is accompanied by nice pictures of Robert and Ross.

Data Analysts Captivated by Power of R
 http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html

January 7, 2009
Data Analysts Captivated by R's Power
By ASHLEE VANCE



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] R in the NY Times

2009-01-07 Thread Simon Pickett
I would like to add that I would have spent many more years doing my PhD if 
it weren't for R! All data management, statistics and graphics were conducted 
using it. This is the direction my university and many more research 
institutes appear to be heading.


It probably doesn't get said enough, and I am sure I speak for all young 
researchers: I am very much in debt to all the kind souls who have helped me 
and other newbies on this forum over the years.


Thanks very much R team.




Re: [R] R in the NY Times

2009-01-07 Thread Kevin E. Thorpe

Zaslavsky, Alan M. wrote:

This article is accompanied by nice pictures of Robert and Ross.

Data Analysts Captivated by Power of R 
http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html



January 7, 2009 Data Analysts Captivated by R’s Power By ASHLEE VANCE


SAS says it has noticed R’s rising popularity at universities,
despite educational discounts on its own software, but it dismisses
the technology as being of interest to a limited set of people
working on very hard tasks.

“I think it addresses a niche market for high-end data analysts that
want free, readily available code,” said Anne H. Milley, director of
technology product marketing at SAS. She adds, “We have customers who
build engines for aircraft. I am happy they are not using freeware
when I get on a jet.”



Thanks for posting.  Does anyone else find the statement by SAS to be 
humorous yet arrogant and short-sighted?


Kevin

--
Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Dalla Lana School of Public Health
University of Toronto
email: kevin.tho...@utoronto.ca  Tel: 416.864.5776  Fax: 416.864.6057



[R] R in the NY Times-IAsians perspective

2009-01-07 Thread Ajay ohri
R and its GUI Rattle helped me establish a data mining consulting startup on
my own, without taking bank credit.
People I met on the forum, and especially books like
http://rforsasandspssusers.com/
helped me ease the transition to the new object-oriented method from the
earlier "even a monkey can create Shakespeare if he types enough" kind of
analytics software.

Since I am in India, the cost differences can cause almost a digital
divide in who can and who can't use sophisticated software.
Thanks to the angels here. Yes we can R...


Regards,

Ajay

www.decisionstats.com





Re: [R] R in the NY Times

2009-01-07 Thread Marc Schwartz
on 01/07/2009 08:44 AM Kevin E. Thorpe wrote:
 Zaslavsky, Alan M. wrote:
 This article is accompanied by nice pictures of Robert and Ross.

 Data Analysts Captivated by Power of R
 http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html



 January 7, 2009 Data Analysts Captivated by R’s Power By ASHLEE VANCE


 SAS says it has noticed R’s rising popularity at universities,
 despite educational discounts on its own software, but it dismisses
 the technology as being of interest to a limited set of people
 working on very hard tasks.

 “I think it addresses a niche market for high-end data analysts that
 want free, readily available code,” said Anne H. Milley, director of
 technology product marketing at SAS. She adds, “We have customers who
 build engines for aircraft. I am happy they are not using freeware
 when I get on a jet.”

 
 Thanks for posting.  Does anyone else find the statement by SAS to be
 humorous yet arrogant and short-sighted?
 
 Kevin

It is an ignorant comment by a marketing person who has been spoon fed
her lines...it is also a comment being made from a very defensive and
insecure posture.

Congrats to R Core and the R Community. This is yet another sign of R's
growth and maturity.

Regards,

Marc Schwartz



Re: [R] R in the NY Times

2009-01-07 Thread Tony Breyal
Thank you for posting this, I found it a very enjoyable read!

I am curious, is there an archive of 'R in the Media' or 'R in the
Press' articles somewhere? It would be interesting to see how the
perception of R has changed/evolved over time relative to other
packages.

Cheers,
Tony Breyal


On 7 Jan, 13:10, Zaslavsky, Alan M. zasla...@hcp.med.harvard.edu
wrote:
 This article is accompanied by nice pictures of Robert and Ross.

 Data Analysts Captivated by Power of R
  http://www.nytimes.com/2009/01/07/technology/business-computing/07pro...

 January 7, 2009
 Data Analysts Captivated by R’s Power
 By ASHLEE VANCE

 To some people R is just the 18th letter of the alphabet. To others, it’s the 
 rating on racy movies, a measure of an attic’s insulation or what pirates in 
 movies say.

 R is also the name of a popular programming language used by a growing number 
 of data analysts inside corporations and academia. It is becoming their 
 lingua franca partly because data mining has entered a golden age, whether 
 being used to set ad prices, find new drugs more quickly or fine-tune 
 financial models. Companies as diverse as Google, Pfizer, Merck, Bank of 
 America, the InterContinental Hotels Group and Shell use it.

 But R has also quickly found a following because statisticians, engineers and 
 scientists without computer programming skills find it easy to use.

 “R is really important to the point that it’s hard to overvalue it,” said 
 Daryl Pregibon, a research scientist at Google, which uses the software 
 widely. “It allows statisticians to do very intricate and complicated 
 analyses without knowing the blood and guts of computing systems.”

 It is also free. R is an open-source program, and its popularity reflects a 
 shift in the type of software used inside corporations. Open-source software 
 is free for anyone to use and modify. I.B.M., Hewlett-Packard and Dell make 
 billions of dollars a year selling servers that run the open-source Linux 
 operating system, which competes with Windows from Microsoft. Most Web sites 
 are displayed using an open-source application called Apache, and companies 
 increasingly rely on the open-source MySQL database to store their critical 
 information. Many people view the end results of all this technology via the 
 Firefox Web browser, also open-source software.

 R is similar to other programming languages, like C, Java and Perl, in that 
 it helps people perform a wide variety of computing tasks by giving them 
 access to various commands. For statisticians, however, R is particularly 
 useful because it contains a number of built-in mechanisms for organizing 
 data, running calculations on the information and creating graphical 
 representations of data sets.

 Some people familiar with R describe it as a supercharged version of 
 Microsoft’s Excel spreadsheet software that can help illuminate data trends 
 more clearly than is possible by entering information into rows and columns.

 What makes R so useful — and helps explain its quick acceptance — is that 
 statisticians, engineers and scientists can improve the software’s code or 
 write variations for specific tasks. Packages written for R add advanced 
 algorithms, colored and textured graphs and mining techniques to dig deeper 
 into databases.

 Close to 1,600 different packages reside on just one of the many Web sites 
 devoted to R, and the number of packages has grown exponentially. One 
 package, called BiodiversityR, offers a graphical interface aimed at making 
 calculations of environmental trends easier.

 Another package, called Emu, analyzes speech patterns, while GenABEL is used 
 to study the human genome.

 The financial services community has demonstrated a particular affinity for 
 R; dozens of packages exist for derivatives analysis alone.

 “The great beauty of R is that you can modify it to do all sorts of things,” 
 said Hal Varian, chief economist at Google. “And you have a lot of 
 prepackaged stuff that’s already available, so you’re standing on the 
 shoulders of giants.”

 R first appeared in 1996, when the statistics professors Ross Ihaka and 
 Robert Gentleman of the University of Auckland in New Zealand released the 
 code as a free software package.

 According to them, the notion of devising something like R sprang up during a 
 hallway conversation. They both wanted technology better suited for their 
 statistics students, who needed to analyze data and produce graphical models 
 of the information. Most comparable software had been designed by computer 
 scientists and proved hard to use.

 Lacking deep computer science training, the professors considered their 
 coding efforts more of an academic game than anything else. Nonetheless, 
 starting in about 1991, they worked on R full time. “We were pretty much 
 inseparable for five or six years,” Mr. Gentleman said. “One person would do 
 the typing and one person would do the thinking.”

 Some statisticians who took 

Re: [R] R in the NY Times

2009-01-07 Thread Rubén Roa-Ureta

Zaslavsky, Alan M. wrote:

This article is accompanied by nice pictures of Robert and Ross.

Data Analysts Captivated by Power of R
  http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html
  

Thanks for the heads up. The R morale is going through the roof!
I've given three courses on R since the second half of 2007 here in 
Chile (geostatistics, Fisheries Libraries for R, and generalized linear 
models) and all my three audiences (professionals working in academia, 
government, and private research institutions) were very much impressed 
by the power of R. I spent as much time on R itself as on the 
statistical topics, since students wanted to learn data management and 
graphics once they started to grasp the basic elements.
R creators, Core Team, package creators and maintainers, and experts on 
the list, thanks so much for such great work and such an open 
attitude. You lead by example.

Rubén



Re: [R] R in the NY Times

2009-01-07 Thread Jeffrey J. Hallman
The article quotes John Chambers, but it doesn't mention that R started out as
an implementation of the S language.  I don't suppose Insightful is too happy
about that.

The SAS spokesman quoted in the article is clearly whistling past the graveyard.
-- 
Jeff



Re: [R] R in the NY Times

2009-01-07 Thread Darin A. England
On Wed, Jan 07, 2009 at 08:00:28AM -0600, Frank E Harrell Jr wrote:
 This is great to see.  It's interesting that SAS Institute feels that 
 non-peer-reviewed software with hidden implementations of analytic 
 methods that cannot be reproduced by others should be trusted when 
 building aircraft engines.
 
 Frank

Unfortunately, that type of FUD issued by the SAS marketing person still
works. I see it at my employer (a large healthcare company.) It's a
battle to change a culture, but ironically the recession helps.
People are now taking notice of the obscene licensing fees for SAS.

Darin



Re: [R] R in the NY Times

2009-01-07 Thread Duncan Murdoch

On 1/7/2009 9:44 AM, Kevin E. Thorpe wrote:

Zaslavsky, Alan M. wrote:

This article is accompanied by nice pictures of Robert and Ross.

Data Analysts Captivated by Power of R 
http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html



January 7, 2009 Data Analysts Captivated by R’s Power By ASHLEE VANCE


SAS says it has noticed R’s rising popularity at universities,
despite educational discounts on its own software, but it dismisses
the technology as being of interest to a limited set of people
working on very hard tasks.

“I think it addresses a niche market for high-end data analysts that
want free, readily available code,” said Anne H. Milley, director of
technology product marketing at SAS. She adds, “We have customers who
build engines for aircraft. I am happy they are not using freeware
when I get on a jet.”



Thanks for posting.  Does anyone else find the statement by SAS to be 
humorous yet arrogant and short-sighted?


To me it just seemed like a blast from the past.

Duncan Murdoch



Re: [R] R in the NY Times

2009-01-07 Thread Peter Dalgaard
Jeffrey J. Hallman wrote:
 The article quotes John Chambers, but it doesn't mention that R started out as
 an implementation of the S language.  I don't suppose Insightful is too happy
 about that.

You mean Tibco...

The statement that S failed to generate broad interest is also a bit
misleading. I believe S-PLUS had more than 10 users in its day,
although it may be true that its success was mainly in the academic
world. Obviously the pool of people who knew S from the preceding decade
was very important for the early development of R.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] R in the NY Times

2009-01-07 Thread Max Kuhn
 You can look on the SAS message boards and see there is a proportional 
 downturn in traffic.

I think that I actually made this statement about both the SAS and
Splus traffic...

I wasn't really trying to be critical of SAS. I was trying to get
across that SAS focused their resources on features that had nothing
to do with *statistical analysis* (e.g. data warehousing etc.)

-- 

Max



Re: [R] R in the NY Times

2009-01-07 Thread David M Smith
On Wed, Jan 7, 2009 at 6:39 AM, Tony Breyal tony.bre...@googlemail.com wrote:
 Thank you for posting this, I found it a very enjoyable read!

 I am curious, is there an archive of 'R in the Media' or 'R in the
 Press' articles somewhere? It would be interesting to see how the
 perception of R has changed/evolved over time relative to other
 packages.

That's a great idea, and I just created an Rmedia category on the
REvolutions R blog to track exactly such articles.  You can find it
here:

http://blog.revolution-computing.com/rmedia/

If anyone knows of any other mainstream articles about R available
online please let me know, and I'll do a round-up post in that section
to make sure they're captured.

By the way, we're writing about R and issues related to R daily at:

http://blog.revolution-computing.com

# David Smith

-- 
David M Smith da...@revolution-computing.com
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)



Re: [R] R in the NY Times

2009-01-07 Thread Bryan Hanson
I believe the SAS person shot themselves in the foot in more ways than
one.  In my mind, the reason you would pay, as Frank said, for
 
 non-peer-reviewed software with hidden implementations of analytic
 methods that cannot be reproduced by others

would be so that you can sue them later when a software problem in the
design of the engine makes your plane fall out of the sky!

Bryan
*
Bryan Hanson
Professor of Chemistry & Biochemistry
DePauw University, Greencastle IN USA


 “I think it addresses a niche market for high-end data analysts that
 want free, readily available code,” said Anne H. Milley, director of
 technology product marketing at SAS. She adds, “We have customers who
 build engines for aircraft. I am happy they are not using freeware
 when I get on a jet.”
 
 
 Thanks for posting.  Does anyone else find the statement by SAS to be
 humorous yet arrogant and short-sighted?
 
 Kevin



Re: [R] R in the NY Times

2009-01-07 Thread Marc Schwartz
I would also point out that the use of the term "freeware" as opposed to
"FOSS" by the SAS rep comes off as being unprofessional and
deliberately condescending...

The author of the article, to his credit, was pretty consistent in using
open source terminology.

Regards,

Marc

on 01/07/2009 10:26 AM Bryan Hanson wrote:
 I believe the SAS person shot themselves in the foot in more ways than
 one.  In my mind, the reason you would pay, as Frank said, for
  
 non-peer-reviewed software with hidden implementations of analytic
 methods that cannot be reproduced by others
 
 Would be so that you can sue them later when a software problem in the
 designing of the engine makes your plane fall out of the sky!



Re: [R] R in the NY Times

2009-01-07 Thread Andrew Choens

 Unfortunately, that type of FUD issued by the SAS marketing person still
 works. I see it at my employer (a large healthcare company.) It's a
 battle to change a culture, but ironically the recession helps.
 People are now taking notice of the obscene licensing fees for SAS.
 
 Darin

I agree. I work for a consulting firm (human services) and my boss
prefers us to use SPSS, rather than R. It's painful. I have version 11
installed on my Windows laptop. Next year, the license expires! 

For someone coming from an SPSS background, R is a little mind-blowing,
simply because it is so much more powerful. But, perseverance pays off.
Once I master Sweave and such, I'll be able to churn out reports much
more quickly than I ever could with SPSS.

I do wish the author of the article had included comments from SPSS, in
addition to the humorous FUD from the SAS spokesperson. Newer versions
of SPSS actually have the option of using R for data analysis, in
addition to the SPSS engine. It would have been interesting to compare
the corporate responses of the two companies.

-- 
Insert something humorous here.  :-)



Re: [R] R in the NY Times

2009-01-07 Thread Ajay ohri
You can use Google Alerts to track media coverage of R using some keywords.

regards,

ajay





Re: [R] R in the NY Times

2009-01-07 Thread Ted Harding
On 07-Jan-09 18:03:19, Erik Iverson wrote:
 I pointed a friend of mine toward the article, to which he replied: 
 
 I hope that they run SAS on Solaris too, god only knows how tainted
 the syscalls are in that linux freeware.
 
 Of course, now Solaris is 'freeware', too, so I suppose that according
 to SAS, running SAS on Windows is the best way to be sure you're
 getting the right answers.

I'm not so sure about that. Since the article described R as
"a supercharged version of Microsoft's Excel", surely people
should run R on Windows and be *ab*so*lute*ly* sure of getting
the right answers (and supercharged to boot)?
Ted.

 


E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 07-Jan-09   Time: 18:30:39
-- XFMail --



Re: [R] R in the NY Times

2009-01-07 Thread Barry Rowlingson
2009/1/7 Darin A. England engl...@cs.umn.edu:

 Unfortunately, that type of FUD issued by the SAS marketing person still
 works. I see it at my employer (a large healthcare company.)

 I see it here, at a university. Quote: "We couldn't possibly do our
analysis using some software we've just downloaded from a web site."
*facepalm*

 It's a
 battle to change a culture, but ironically the recession helps.
 People are now taking notice of the obscene licensing fees for SAS.

 They'll just keep increasing their educational discount, or as we
say, "the first hit is free"...

BaRRy



Re: [R] R in the NY Times

2009-01-07 Thread Tony Breyal
Google Alerts are great, but unfortunately the brevity of R's name is
the main problem, I think.

Though, thinking about it, I suppose if one could work out the 'best'
keywords to use, it might be possible to not get too many
misclassified results, e.g.,

http://news.google.com/news?hl=en&ned=us&nolr=1&q=r+open+source+programming+language&btnG=Search

or something like that. Will be keeping an eye on David's page from
time to time though, just in case he catches anything :-)

Lovely to see R getting the attention it so rightly deserves.




On 7 Jan, 18:29, Ajay ohri ohri2...@gmail.com wrote:
 you can use google alerts to track media coverage of R using some keywords

 regards,

 ajay

 On Wed, Jan 7, 2009 at 9:52 PM, David M Smith <da...@revolution-computing.com> wrote:
  On Wed, Jan 7, 2009 at 6:39 AM, Tony Breyal <tony.bre...@googlemail.com> wrote:
   Thank you for posting this, I found it a very enjoyable read!

   I am curious, is there an archive of 'R in the Media' or 'R in the
   Press' articles somewhere? It would be interesting to see how the
   perception of R has changed/evolved over time relative to other
   packages.

  That's a great idea, and I just created an Rmedia category on the
  REvolutions R blog to track exactly such articles.  You can find it
  here:

 http://blog.revolution-computing.com/rmedia/

  If anyone knows of any other mainstream articles about R available
  online please let me know, and I'll do a round-up post in that section
  to make sure they're captured.

  By the way, we're writing about R and issues related to R daily at:

 http://blog.revolution-computing.com

  # David Smith

  --
  David M Smith da...@revolution-computing.com
  Director of Community, REvolution Computing www.revolution-computing.com
  Tel: +1 (206) 577-4778 x3203 (Seattle, USA)




Re: [R] R in the NY Times

2009-01-07 Thread Wacek Kusnierczyk
Kevin E. Thorpe wrote:
 Zaslavsky, Alan M. wrote:
 SAS says it has noticed R’s rising popularity at universities,
 despite educational discounts on its own software, but it dismisses
 the technology as being of interest to a limited set of people
 working on very hard tasks.

 “I think it addresses a niche market for high-end data analysts that
 want free, readily available code,” said Anne H. Milley, director of
 technology product marketing at SAS. She adds, “We have customers who
 build engines for aircraft. I am happy they are not using freeware
 when I get on a jet.”


 Thanks for posting. Does anyone else find the statement by SAS to be
 humourous yet arrogant and short-sighted?

there must be something wrong with me, but i can't find anything
'humorous yet arrogant and short-sighted' in the idea that engines for
aircraft be built with software that does not advertise itself with
'ABSOLUTELY NO WARRANTY.'


vQ



Re: [R] R in the NY Times

2009-01-07 Thread Marc Schwartz
on 01/07/2009 09:29 AM Max Kuhn wrote:
 You can look on the SAS message boards and see there is a proportional 
 downturn in traffic.
 
 I think that I actually made this statement about both the SAS and
 Splus traffic...
 
 I wasn't really trying to be critical of SAS. I was trying to get
 across that SAS focused their resources on features that had nothing
 to do with *statistical analysis* (e.g. data warehousing etc.)


Presuming that the Google Groups archive of SAS-L is reasonably complete:

 http://groups.google.com/group/comp.soft-sys.sas/about

The monthly posting frequency data since 1993 is:

Posts <- structure(list(Jan = c(NA, 546L, 548L, 853L, 1007L, 894L, 514L,
1720L, 1826L, 1941L, 1832L, 1636L, 2122L, 2722L, 2750L, 2305L,
357L), Feb = c(NA, 511L, 734L, 1024L, 1150L, 1068L, 493L, 1519L,
1537L, 1845L, 1846L, 1652L, 1960L, 1645L, 926L, 2255L, NA), Mar = c(NA,
658L, 963L, 805L, 1108L, 945L, 659L, 1177L, 1915L, 2010L, 1755L,
2188L, 629L, 1711L, 1728L, 2712L, NA), Apr = c(NA, 681L, 792L,
1052L, 1315L, 784L, 1077L, 1163L, 1467L, 2199L, 1757L, 1826L,
2169L, 2796L, 2766L, 2789L, NA), May = c(NA, 712L, 945L, 1163L,
1212L, 448L, 778L, 1963L, 1735L, 2373L, 1863L, 1836L, 2283L,
3147L, 2974L, 2025L, NA), Jun = c(NA, 751L, 1002L, 999L, 1127L,
813L, 540L, 1615L, 1905L, 2133L, 1701L, 2606L, 2407L, 2723L,
2691L, 2368L, NA), Jul = c(15L, 763L, 775L, 1184L, 1074L, 896L,
476L, 1572L, 2027L, 2445L, 1926L, 1843L, 2061L, 761L, 2435L,
2607L, NA), Aug = c(458L, 975L, 969L, 1053L, 692L, 823L, 612L,
1696L, 1976L, 1492L, 1689L, 2143L, 1793L, 2027L, 2592L, 2584L,
NA), Sep = c(330L, 703L, 745L, 1176L, 947L, 894L, 1351L, 1491L,
1439L, 1864L, 1646L, 1784L, 1365L, 2714L, 1868L, 2554L, NA),
Oct = c(219L, 805L, 691L, 1197L, 900L, 1129L, 1708L, 1669L,
1592L, 2133L, 1832L, 1712L, 1427L, 2983L, 2320L, 2434L, NA
), Nov = c(472L, 752L, 773L, 911L, 853L, 733L, 1720L, 1490L,
1636L, 1663L, 1545L, 1786L, 1518L, 2848L, 2112L, 1984L, NA
), Dec = c(517L, 666L, 765L, 844L, 677L, 492L, 1595L, 1298L,
1424L, 1520L, 1445L, 2148L, 1524L, 2374L, 1948L, 1921L, NA
)), .Names = c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), class = "data.frame",
row.names = c("1993",
"1994", "1995", "1996", "1997", "1998", "1999", "2000", "2001",
"2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009"
))



> Posts
  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
1993   NA   NA   NA   NA   NA   NA   15  458  330  219  472  517
1994  546  511  658  681  712  751  763  975  703  805  752  666
1995  548  734  963  792  945 1002  775  969  745  691  773  765
1996  853 1024  805 1052 1163  999 1184 1053 1176 1197  911  844
1997 1007 1150 1108 1315 1212 1127 1074  692  947  900  853  677
1998  894 1068  945  784  448  813  896  823  894 1129  733  492
1999  514  493  659 1077  778  540  476  612 1351 1708 1720 1595
2000 1720 1519 1177 1163 1963 1615 1572 1696 1491 1669 1490 1298
2001 1826 1537 1915 1467 1735 1905 2027 1976 1439 1592 1636 1424
2002 1941 1845 2010 2199 2373 2133 2445 1492 1864 2133 1663 1520
2003 1832 1846 1755 1757 1863 1701 1926 1689 1646 1832 1545 1445
2004 1636 1652 2188 1826 1836 2606 1843 2143 1784 1712 1786 2148
2005 2122 1960  629 2169 2283 2407 2061 1793 1365 1427 1518 1524
2006 2722 1645 1711 2796 3147 2723  761 2027 2714 2983 2848 2374
2007 2750  926 1728 2766 2974 2691 2435 2592 1868 2320 2112 1948
2008 2305 2255 2712 2789 2025 2368 2607 2584 2554 2434 1984 1921
2009  357   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA


One can then review the annual posting frequency via:

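# Write the plot to a PDF file
pdf("SAS-L.pdf", height = 4, width = 7)

# Annual totals; barplot() returns the bar midpoints
mp <- barplot(rowSums(Posts, na.rm = TRUE),
  beside = TRUE,
  cex.names = 0.6, main = "SAS-L Traffic",
  cex.axis = 0.75, las = 1)

# Print each annual total beneath its bar
mtext(text = rowSums(Posts, na.rm = TRUE), at = mp, side = 1,
  line = 2, cex = 0.5)

dev.off()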


There would appear to be marked increases in 2000 and again in 2006.
However, it has been flat for the past 3 calendar years. No decline yet,
but it will happen in due course...
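
The "flat for the past 3 calendar years" reading can be checked directly from the table above. A minimal sketch, using only the 2006-2008 rows copied from that table; the object names recent, annual and pct_change are illustrative and not part of the original post:

```r
# Monthly SAS-L post counts for 2006-2008, copied from the table above.
recent <- rbind(
  "2006" = c(2722, 1645, 1711, 2796, 3147, 2723,  761, 2027, 2714, 2983, 2848, 2374),
  "2007" = c(2750,  926, 1728, 2766, 2974, 2691, 2435, 2592, 1868, 2320, 2112, 1948),
  "2008" = c(2305, 2255, 2712, 2789, 2025, 2368, 2607, 2584, 2554, 2434, 1984, 1921)
)
colnames(recent) <- month.abb

# Annual totals, the same quantity the barplot() call displays.
annual <- rowSums(recent)

# Year-over-year change in percent (the first year has no predecessor).
pct_change <- round(100 * diff(annual) / head(annual, -1), 1)

print(annual)
print(pct_change)
```

The three totals differ by only a few percent in either direction, which is consistent with describing the recent traffic as flat rather than declining.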



No comparable posting data table exists for S-News as far as I can find,
so I wrote a quick program to read the S-News archive pages here:

  http://www.biostat.wustl.edu/archives/html/s-news/

and get monthly posting counts, using the 'Thread' based html pages,
where each monthly embedded post link has a URL of the form:

http://www.biostat.wustl.edu/archives/html/s-news/YYYY-MM/msgX.html


Thus, the program I used is:

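# One "YYYY-MM" label per month from 1998 through 2009
TD <- paste(rep(1998:2009, each = 12), sprintf("%02d", 1:12), sep = "-")
Posts <- numeric(length(TD))

for (i in seq(along = TD))
{
  URL <- paste("http://www.biostat.wustl.edu/archives/html/s-news/",
               TD[i], "/threads.html", sep = "")

  cat(URL, "\n")

  # A month whose page cannot be read is recorded as NA
  if (!inherits(try(con <- readLines(URL)), "try-error"))
  {
    # Count the lines that contain a link to a message page
    Posts[i] <- length(grep("msg.*\\.html", con))
    rm(con)
  } else {
    Posts[i] <- NA
  }
}


# Reshape to one row per year and one column per month
Posts <- matrix(Posts, ncol = 12, byrow = TRUE)
rownames(Posts) <- 1998:2009
colnames(Posts) <- month.abb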

That gives you:

Posts <- 
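
The counting step in the program above can be tried offline. A minimal sketch, assuming a made-up excerpt of a monthly threads page (the page vector and its msgNNNNN.html file names are hypothetical; the real S-News pages may be laid out differently):

```r
# Hypothetical excerpt of a monthly threads.html page.
page <- c(
  "<html><body>",
  "<li><a href=\"msg00000.html\">First post</a></li>",
  "<li><a href=\"msg00001.html\">Re: First post</a></li>",
  "<li><a href=\"msg00002.html\">A</a> <a href=\"msg00003.html\">B</a></li>",
  "<a href=\"index.html\">Date index</a>",
  "</body></html>"
)

# grep() returns the indices of matching lines, so length() counts the
# lines that contain at least one message link, not the links themselves.
n_lines <- length(grep("msg.*\\.html", page))

# Counting every link separately needs gregexpr() plus regmatches().
links <- regmatches(page, gregexpr("msg[0-9]+\\.html", page))
n_links <- length(unlist(links))

print(c(lines = n_lines, links = n_links))
```

The two counts agree only when each message link sits on its own line, which the line-based count in the program implicitly assumes.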

Re: [R] R in the NY Times

2009-01-07 Thread Spencer Graves
What kind of warranty does SAS offer?  I haven't read their EULA 
recently, but if an airplane fell out of the sky because of a bug in SAS 
code, I'd be surprised if SAS was eager to pay damages!


Spencer



Re: [R] R in the NY Times

2009-01-07 Thread Duncan Murdoch

On 1/7/2009 3:03 PM, Wacek Kusnierczyk wrote:
> Kevin E. Thorpe wrote:
>> Zaslavsky, Alan M. wrote:
>>> SAS says it has noticed R’s rising popularity at universities,
>>> despite educational discounts on its own software, but it dismisses
>>> the technology as being of interest to a limited set of people
>>> working on very hard tasks.
>>>
>>> “I think it addresses a niche market for high-end data analysts that
>>> want free, readily available code,” said Anne H. Milley, director of
>>> technology product marketing at SAS. She adds, “We have customers who
>>> build engines for aircraft. I am happy they are not using freeware
>>> when I get on a jet.”
>>
>> Thanks for posting. Does anyone else find the statement by SAS to be
>> humourous yet arrogant and short-sighted?
>
> there must be something wrong with me, but i can't find anything
> 'humorous yet arrogant and short-sighted' in the idea that engines for
> aircraft be built with software that does not advertise itself with
> 'ABSOLUTELY NO WARRANTY.'


Yes, everyone knows that the lack of warranty should be hidden in the 
fine print, and say something like this:


Institute warrants that the media on which SAS/C OnlineDoc is furnished 
will be free from defects in material and workmanship under normal use 
for a period of ninety (90) days from the date of delivery of SAS/C 
OnlineDoc. Licensee’s exclusive remedy for breach of this warranty shall 
be replacement of the defective media by the Institute.  Institute and 
its licensors disclaim all other warranties, express or implied, 
including, but not limited to, any implied warranties of merchantability 
and/or fitness for a particular purpose whether alleged to arise by law, 
by reason of custom or usage in the trade, or by course of dealing. 


(Sorry, I couldn't find SAS/Stat's lack of warranty.  I found this one at
http://support.sas.com/documentation/onlinedoc/sasc/doc700/html/common/agreement.htm)


Duncan Murdoch



Re: [R] R in the NY Times

2009-01-07 Thread Mitchell Maltenfort
On Wed, Jan 7, 2009 at 3:19 PM, Spencer Graves spencer.gra...@pdf.com wrote:
 What kind of warranty does SAS offer?  I haven't read their EULA recently,
 but if an airplane fell out of the sky because of a bug in SAS code, I'd be
 surprised if SAS was eager to pay damages!

 Spencer




And that's an issue that always comes up on Linux v. Microsoft -- just
because you pay money for it doesn't mean you're buying meaningful
guarantees.
-- 
Due to the recession, requests for instant gratification will be
deferred until arrears in scheduled gratification have been satisfied.



Re: [R] R in the NY Times

2009-01-07 Thread Thomas Adams

Wacek,

One would hope that if someone were to use software to build engines 
for aircraft, that said person would sufficiently test the software to 
have confidence in it, whether it had a Warranty or not — at least 
that's my mode of operation…


Cheers!
Tom

  



--
Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL:  thomas.ad...@noaa.gov

VOICE:  937-383-0528
FAX:937-383-0033



Re: [R] R in the NY Times

2009-01-07 Thread Erik Iverson
I pointed a friend of mine toward the article, to which he replied:

"I hope that they run SAS on Solaris too, god only knows how tainted the
syscalls are in that linux freeware."

Of course, now Solaris is 'freeware', too, so I suppose that according to
SAS, running SAS on Windows is the best way to be sure you're getting the
right answers. 

On Wed, 07 Jan 2009 10:56:53 -0600, Marc Schwartz
<marc_schwa...@comcast.net> wrote:
> I would also point out that the use of the term "freeware" as opposed to
> "FOSS" by the SAS rep, comes off as being unprofessional and
> deliberately condescending...
>
> The author of the article, to his credit, was pretty consistent in using
> "open source" terminology.
>
> Regards,
>
> Marc
>
> on 01/07/2009 10:26 AM Bryan Hanson wrote:
>> I believe the SAS person shot themselves in the foot in more ways than
>> one.  In my mind, the reason you would pay, as Frank said, for
>>
>> "non-peer-reviewed software with hidden implementations of analytic
>> methods that cannot be reproduced by others"
>>
>> would be so that you can sue them later when a software problem in the
>> designing of the engine makes your plane fall out of the sky!
 


Re: [R] R in the NY Times

2009-01-07 Thread Gabor Grothendieck
Here is the same number of messages/posts data
for each of S, SAS, R:
- reworked into a 3 column ts class time series
- with Jan 2009 removed since it's not complete
- leading and trailing NA rows removed

At the end we plot the raw data as well as the time
series of totals and show loess smooths for each.

By running the code below we see that:
- the sum of the three seems to be rising at a constant rate
- S is declining
- SAS and R are rising
- R is rising the fastest, though it has completed its phase
of highest growth, which ended around 2004

tt3 <- structure(c(15, 458, 330, 219, 472, 517, 546, 511, 658, 681,
712, 751, 763, 975, 703, 805, 752, 666, 548, 734, 963, 792, 945,
1002, 775, 969, 745, 691, 773, 765, 853, 1024, 805, 1052, 1163,
999, 1184, 1053, 1176, 1197, 911, 844, 1007, 1150, 1108, 1315,
1212, 1127, 1074, 692, 947, 900, 853, 677, 894, 1068, 945, 784,
448, 813, 896, 823, 894, 1129, 733, 492, 514, 493, 659, 1077,
778, 540, 476, 612, 1351, 1708, 1720, 1595, 1720, 1519, 1177,
1163, 1963, 1615, 1572, 1696, 1491, 1669, 1490, 1298, 1826, 1537,
1915, 1467, 1735, 1905, 2027, 1976, 1439, 1592, 1636, 1424, 1941,
1845, 2010, 2199, 2373, 2133, 2445, 1492, 1864, 2133, 1663, 1520,
1832, 1846, 1755, 1757, 1863, 1701, 1926, 1689, 1646, 1832, 1545,
1445, 1636, 1652, 2188, 1826, 1836, 2606, 1843, 2143, 1784, 1712,
1786, 2148, 2122, 1960, 629, 2169, 2283, 2407, 2061, 1793, 1365,
1427, 1518, 1524, 2722, 1645, 1711, 2796, 3147, 2723, 761, 2027,
2714, 2983, 2848, 2374, 2750, 926, 1728, 2766, 2974, 2691, 2435,
2592, 1868, 2320, 2112, 1948, 2305, 2255, 2712, 2789, 2025, 2368,
2607, 2584, 2554, 2434, 1984, 1921, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
273, 378, 293, 330, 243, 219, 209, 191, 241, 181, 141, 210, 173,
313, 300, 334, 254, 284, 270, 300, 253, 300, 194, 264, 313, 285,
264, 306, 247, 245, 302, 204, 251, 261, 176, 246, 232, 252, 300,
331, 282, 258, 260, 260, 229, 232, 194, 230, 255, 242, 228, 219,
248, 230, 207, 221, 280, 228, 177, 189, 179, 218, 196, 189, 217,
221, 187, 186, 295, 197, 142, 197, 230, 257, 151, 164, 175, 154,
187, 195, 150, 176, 176, 174, 161, 193, 182, 174, 109, 159, 144,
107, 98, 82, 84, 109, 87, 99, 123, 107, 96, 84, 97, 68, 73, 53,
20, 51, 59, 74, 48, 46, 34, 47, 39, 35, 70, 56, 41, 48, 63, 58,
47, 31, 27, 40, 28, 41, 30, 27, 36, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, 92, 36, 47, 41, 37, 40, 76, 61, 57, 135,
79, 114, 101, 90, 105, 110, 64, 94, 96, 184, 105, 226, 145, 195,
189, 161, 186, 184, 148, 203, 231, 318, 221, 205, 355, 377, 377,
504, 418, 293, 356, 434, 418, 433, 422, 558, 583, 651, 470, 552,
550, 615, 562, 678, 657, 825, 530, 884, 697, 880, 965, 1057,
926, 918, 824, 705, 1055, 1038, 742, 1017, 1137, 1203, 1488,
1268, 1319, 1344, 1210, 1443, 1567, 1605, 1158, 1116, 1580, 1946,
1657, 1561, 1714, 1618, 1493, 1534, 1712, 1895, 1481, 1746, 1724,
1703, 2057, 1887, 2056, 1872, 1777, 1709, 1810, 1907, 1508, 2075,
1920, 2270, 1818, 2029, 1811, 1785, 1898, 1902, 2328, 2127, 1450,
1714, 1907, 2191, 2145, 2210, 2307, 2138, 2241, 2028, 2708, 2594,
2028, 2490, 2583, 2740, 2487, 2517, 2774, 3268, 2813, 2990, 3037,
2730, 2399), .Dim = c(186L, 3L), .Dimnames = list(NULL, c("SAS",
"S", "R")), .Tsp = c(1993.5, 2008 + 11/12, 12), class = c("mts",
"ts"))

# Add the row totals as a fourth series
tt4 <- cbind(tt3, rowSums(tt3))
colnames(tt4) <- c(colnames(tt3), "Sum")
ts.plot(tt4, col = 1:4)
grid()
legend("topleft", colnames(tt4), lty = 1, col = 1:4)

# Overlay a loess smooth for each series
library(dyn)
for(i in 1:4) lines(fitted(dyn$loess(tt4[, i] ~ time(tt4))), col = i)
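
The time-series attributes used above (186 monthly rows starting in mid-1993) can be reproduced with ts(), and a comparable smooth can be obtained from stats::loess without the dyn package. A minimal sketch with placeholder values; x, params, fit and smooth are illustrative names, and this is not the code used for the plot above:

```r
# 186 monthly observations starting in July 1993 (placeholder values).
x <- ts(sqrt(seq_len(186)), start = c(1993, 7), frequency = 12)

# tsp() returns c(start, end, frequency); July 1993 is 1993 + 6/12 and
# the 186th month, December 2008, is 2008 + 11/12.
params <- tsp(x)

# Loess smooth of the series against time, using stats::loess directly.
fit <- loess(as.numeric(x) ~ as.numeric(time(x)))
smooth <- ts(fitted(fit), start = start(x), frequency = frequency(x))

print(params)
```

Rebuilding the fitted values as a ts object with the same start and frequency keeps them aligned with the original series when drawn with lines().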



Re: [R] R in the NY Times

2009-01-07 Thread hadley wickham
Here's a couple of similar plots created with ggplot2.  I chose to
turn the data into a data frame with an explicit date column.  Using a
log scale somewhat stabilises the variability.

## SAS-L traffic
sas <- structure(list(Jan = c(NA, 546L, 548L, 853L, 1007L, 894L, 514L,
1720L, 1826L, 1941L, 1832L, 1636L, 2122L, 2722L, 2750L, 2305L,
357L), Feb = c(NA, 511L, 734L, 1024L, 1150L, 1068L, 493L, 1519L,
1537L, 1845L, 1846L, 1652L, 1960L, 1645L, 926L, 2255L, NA), Mar = c(NA,
658L, 963L, 805L, 1108L, 945L, 659L, 1177L, 1915L, 2010L, 1755L,
2188L, 629L, 1711L, 1728L, 2712L, NA), Apr = c(NA, 681L, 792L,
1052L, 1315L, 784L, 1077L, 1163L, 1467L, 2199L, 1757L, 1826L,
2169L, 2796L, 2766L, 2789L, NA), May = c(NA, 712L, 945L, 1163L,
1212L, 448L, 778L, 1963L, 1735L, 2373L, 1863L, 1836L, 2283L,
3147L, 2974L, 2025L, NA), Jun = c(NA, 751L, 1002L, 999L, 1127L,
813L, 540L, 1615L, 1905L, 2133L, 1701L, 2606L, 2407L, 2723L,
2691L, 2368L, NA), Jul = c(15L, 763L, 775L, 1184L, 1074L, 896L,
476L, 1572L, 2027L, 2445L, 1926L, 1843L, 2061L, 761L, 2435L,
2607L, NA), Aug = c(458L, 975L, 969L, 1053L, 692L, 823L, 612L,
1696L, 1976L, 1492L, 1689L, 2143L, 1793L, 2027L, 2592L, 2584L,
NA), Sep = c(330L, 703L, 745L, 1176L, 947L, 894L, 1351L, 1491L,
1439L, 1864L, 1646L, 1784L, 1365L, 2714L, 1868L, 2554L, NA),
Oct = c(219L, 805L, 691L, 1197L, 900L, 1129L, 1708L, 1669L,
1592L, 2133L, 1832L, 1712L, 1427L, 2983L, 2320L, 2434L, NA
), Nov = c(472L, 752L, 773L, 911L, 853L, 733L, 1720L, 1490L,
1636L, 1663L, 1545L, 1786L, 1518L, 2848L, 2112L, 1984L, NA
), Dec = c(517L, 666L, 765L, 844L, 677L, 492L, 1595L, 1298L,
1424L, 1520L, 1445L, 2148L, 1524L, 2374L, 1948L, 1921L, NA
)), .Names = c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), class = "data.frame",
row.names = c("1993",
"1994", "1995", "1996", "1997", "1998", "1999", "2000", "2001",
"2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009"
))

## s-news traffic
s <- structure(c(NA, 210, 264, 246, 230, 189, 197, 174, 109, 51, 48,
5, 273, 173, 313, 232, 255, 179, 230, 161, 87, 59, 63, NA, 378,
313, 285, 252, 242, 218, 257, 193, 99, 74, 58, NA, 293, 300,
264, 300, 228, 196, 151, 182, 123, 48, 47, NA, 330, 334, 306,
331, 219, 189, 164, 174, 107, 46, 31, NA, 243, 254, 247, 282,
248, 217, 175, 109, 96, 34, 27, NA, 219, 284, 245, 258, 230,
221, 154, 159, 84, 47, 40, NA, 209, 270, 302, 260, 207, 187,
187, 144, 97, 39, 28, NA, 191, 300, 204, 260, 221, 186, 195,
107, 68, 35, 41, NA, 241, 253, 251, 229, 280, 295, 150, 98, 73,
70, 30, NA, 181, 300, 261, 232, 228, 197, 176, 82, 53, 56, 27,
NA, 141, 194, 176, 194, 177, 142, 176, 84, 20, 41, 36, NA), .Dim = c(12L,
12L), .Dimnames = list(c("1998", "1999", "2000", "2001", "2002",
"2003", "2004", "2005", "2006", "2007", "2008", "2009"), c("Jan",
"Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct",
"Nov", "Dec")))

## R-help traffic
r <- structure(c(NA, 135, 226, 205, 558, 884, 1017, 1116, 1746,
2075, 1714, 2490, 462, NA, 79, 145, 355, 583, 697, 1137, 1580, 1724,
1920, 1907, 2583, NA, NA, 114, 195, 377, 651, 880, 1203, 1946,
1703, 2270, 2191, 2740, NA, 92, 101, 189, 377, 470, 965, 1488,
1657, 2057, 1818, 2145, 2487, NA, 36, 90, 161, 504, 552, 1057,
1268, 1561, 1887, 2029, 2210, 2517, NA, 47, 105, 186, 418, 550,
926, 1319, 1714, 2056, 1811, 2307, 2774, NA, 41, 110, 184, 293,
615, 918, 1344, 1618, 1872, 1785, 2138, 3268, NA, 37, 64, 148,
356, 562, 824, 1210, 1493, 1777, 1898, 2241, 2813, NA, 40, 94,
203, 434, 678, 705, 1443, 1534, 1709, 1902, 2028, 2990, NA, 76,
96, 231, 418, 657, 1055, 1567, 1712, 1810, 2328, 2708, 3037,
NA, 61, 184, 318, 433, 825, 1038, 1605, 1895, 1907, 2127, 2594,
2730, NA, 57, 105, 221, 422, 530, 742, 1158, 1481, 1508, 1450,
2028, 2399, NA), .Dim = c(13L, 12L), .Dimnames = list(c("1997",
"1998", "1999", "2000", "2001", "2002", "2003", "2004", "2005",
"2006", "2007", "2008", "2009"), c("Jan", "Feb", "Mar", "Apr",
"May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")))

library(reshape)
# Reshape each table to long form: one row per year/month
sas <- melt(as.matrix(sas), na.rm = TRUE)
r <- melt(r, na.rm = TRUE)
s <- melt(s, na.rm = TRUE)
names(r) <- names(s) <- names(sas) <- c("year", "month", "count")

# Label each data frame with its source and stack them
sas$software <- "sas"
s$software <- "s"
r$software <- "r"
all <- rbind(sas, s, r)
# Mid-month date for plotting ("%b" parses the month abbreviation)
all$date <- with(all,
  as.Date(paste(year, month, 15, sep = "-"), "%Y-%b-%d"))


library(ggplot2)
library(plyr)  # provides ddply() and .()
qplot(date, count, data = all, geom = "line", colour = software) +
   geom_smooth(se = FALSE, size = 1)
last_plot() + scale_y_log10(breaks = 10^(1:3), labels = 10^(1:3))

# Yearly totals per mailing list
yearly <- ddply(all, .(year, software), function(df) c(count = sum(df$count)))
qplot(year, count, data = yearly, geom = "line", colour = software)
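
The explicit date column above is parsed with "%b", which depends on the month abbreviations of the current locale. A minimal sketch of a locale-independent alternative, assuming the months are stored as English abbreviations; counts and month_num are illustrative names and the two rows are copied from the 2008 SAS-L figures:

```r
# Two rows in the long (year, month, count) layout produced by melt().
counts <- data.frame(
  year  = c(2008L, 2008L),
  month = c("Jan", "Mar"),
  count = c(2305L, 2712L),
  stringsAsFactors = FALSE
)

# match() against month.abb yields the month number without relying on
# the locale-dependent "%b" format.
month_num <- match(counts$month, month.abb)

# Build an ISO date string for the 15th of each month and parse it.
counts$date <- as.Date(sprintf("%d-%02d-15", counts$year, month_num))

print(counts)
```

Placing every observation on the 15th, as in the post, centres each monthly count within its month on the time axis.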


Hadley

-- 
http://had.co.nz/



Re: [R] R in the NY Times

2009-01-07 Thread Gabor Grothendieck
I did try the log version as well prior to posting. Although it does
seem to exaggerate the difference, to me the insights from plotting
the raw data with loess (i.e. constant growth of the total, piecewise
constant growth of R) come through best.

On Wed, Jan 7, 2009 at 6:53 PM, Spencer Graves
spencer.gra...@prodsyse.com wrote:
 Thanks, Gabor, Marc, Max:
 The image is even more striking (and more accurately reflects reality, I
 believe) if you add log='y' to ts.plot.
 Best Wishes,
 Spencer
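
The raw-versus-log question comes down to what a straight line means on each scale: on a log axis, a straight line is a constant percentage growth rate rather than a constant number of extra posts per month. A minimal sketch with a made-up series (growth, y, step and slope are hypothetical and not taken from the list data):

```r
# Hypothetical series growing by 5% per month for three years.
growth <- 0.05
y <- ts(100 * (1 + growth)^(0:35), start = c(2006, 1), frequency = 12)

# On the log scale the series is exactly linear in the step index,
# with slope log(1 + growth).
step <- seq_along(y)
slope <- unname(coef(lm(log(as.numeric(y)) ~ step))[2])

print(c(slope = slope, expected = log(1 + growth)))
```

For the plot itself, ts.plot() accepts log = "y" among its graphical arguments, so the suggestion above amounts to adding that argument to the existing call.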


Re: [R] R in the NY Times

2009-01-07 Thread Marc Schwartz
 Here's a couple of similar plots created with ggplot2.  I chose to
 turn the data into a data frame with an explicit date column.  Using a
 log scale somewhat stabilises the variability.
 
 ## SAS-L traffic
 sas - structure(list(Jan = c(NA, 546L, 548L, 853L, 1007L, 894L, 514L,
 1720L, 1826L, 1941L, 1832L, 1636L, 2122L, 2722L, 2750L, 2305L,
 357L), Feb = c(NA, 511L, 734L, 1024L, 1150L, 1068L, 493L, 1519L,
 1537L, 1845L, 1846L, 1652L, 1960L, 1645L, 926L, 2255L, NA), Mar = c(NA,
 658L, 963L, 805L, 1108L, 945L, 659L, 1177L, 1915L, 2010L, 1755L,
 2188L, 629L, 1711L, 1728L, 2712L, NA), Apr = c(NA, 681L, 792L,
 1052L, 1315L, 784L, 1077L, 1163L, 1467L, 2199L, 1757L, 1826L,
 2169L, 2796L, 2766L, 2789L, NA), May = c(NA, 712L, 945L, 1163L,
 1212L, 448L, 778L, 1963L, 1735L, 2373L, 1863L, 1836L, 2283L,
 3147L, 2974L, 2025L, NA), Jun = c(NA, 751L, 1002L, 999L, 1127L,
 813L, 540L, 1615L, 1905L, 2133L, 1701L, 2606L, 2407L, 2723L,
 2691L, 2368L, NA), Jul = c(15L, 763L, 775L, 1184L, 1074L, 896L,
 476L, 1572L, 2027L, 2445L, 1926L, 1843L, 2061L, 761L, 2435L,
 2607L, NA), Aug = c(458L, 975L, 969L, 1053L, 692L, 823L, 612L,
 1696L, 1976L, 1492L, 1689L, 2143L, 1793L, 2027L, 2592L, 2584L,
 NA), Sep = c(330L, 703L, 745L, 1176L, 947L, 894L, 1351L, 1491L,
 1439L, 1864L, 1646L, 1784L, 1365L, 2714L, 1868L, 2554L, NA),
 Oct = c(219L, 805L, 691L, 1197L, 900L, 1129L, 1708L, 1669L,
 1592L, 2133L, 1832L, 1712L, 1427L, 2983L, 2320L, 2434L, NA
 ), Nov = c(472L, 752L, 773L, 911L, 853L, 733L, 1720L, 1490L,
 1636L, 1663L, 1545L, 1786L, 1518L, 2848L, 2112L, 1984L, NA
 ), Dec = c(517L, 666L, 765L, 844L, 677L, 492L, 1595L, 1298L,
 1424L, 1520L, 1445L, 2148L, 1524L, 2374L, 1948L, 1921L, NA
 )), .Names = c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
 "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"), class = "data.frame",
 row.names = c("1993",
 "1994", "1995", "1996", "1997", "1998", "1999", "2000", "2001",
 "2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009"
 ))
 
 ## s-news traffic
 s <- structure(c(NA, 210, 264, 246, 230, 189, 197, 174, 109, 51, 48,
 5, 273, 173, 313, 232, 255, 179, 230, 161, 87, 59, 63, NA, 378,
 313, 285, 252, 242, 218, 257, 193, 99, 74, 58, NA, 293, 300,
 264, 300, 228, 196, 151, 182, 123, 48, 47, NA, 330, 334, 306,
 331, 219, 189, 164, 174, 107, 46, 31, NA, 243, 254, 247, 282,
 248, 217, 175, 109, 96, 34, 27, NA, 219, 284, 245, 258, 230,
 221, 154, 159, 84, 47, 40, NA, 209, 270, 302, 260, 207, 187,
 187, 144, 97, 39, 28, NA, 191, 300, 204, 260, 221, 186, 195,
 107, 68, 35, 41, NA, 241, 253, 251, 229, 280, 295, 150, 98, 73,
 70, 30, NA, 181, 300, 261, 232, 228, 197, 176, 82, 53, 56, 27,
 NA, 141, 194, 176, 194, 177, 142, 176, 84, 20, 41, 36, NA), .Dim = c(12L,
 12L), .Dimnames = list(c("1998", "1999", "2000", "2001", "2002",
 "2003", "2004", "2005", "2006", "2007", "2008", "2009"), c("Jan",
 "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct",
 "Nov", "Dec")))
 
 ## R-help traffic
 r <- structure(c(NA, 135, 226, 205, 558, 884, 1017, 1116, 1746,
 2075, 1714, 2490, 462, NA, 79, 145, 355, 583, 697, 1137, 1580, 1724,
 1920, 1907, 2583, NA, NA, 114, 195, 377, 651, 880, 1203, 1946,
 1703, 2270, 2191, 2740, NA, 92, 101, 189, 377, 470, 965, 1488,
 1657, 2057, 1818, 2145, 2487, NA, 36, 90, 161, 504, 552, 1057,
 1268, 1561, 1887, 2029, 2210, 2517, NA, 47, 105, 186, 418, 550,
 926, 1319, 1714, 2056, 1811, 2307, 2774, NA, 41, 110, 184, 293,
 615, 918, 1344, 1618, 1872, 1785, 2138, 3268, NA, 37, 64, 148,
 356, 562, 824, 1210, 1493, 1777, 1898, 2241, 2813, NA, 40, 94,
 203, 434, 678, 705, 1443, 1534, 1709, 1902, 2028, 2990, NA, 76,
 96, 231, 418, 657, 1055, 1567, 1712, 1810, 2328, 2708, 3037,
 NA, 61, 184, 318, 433, 825, 1038, 1605, 1895, 1907, 2127, 2594,
 2730, NA, 57, 105, 221, 422, 530, 742, 1158, 1481, 1508, 1450,
 2028, 2399, NA), .Dim = c(13L, 12L), .Dimnames = list(c("1997",
 "1998", "1999", "2000", "2001", "2002", "2003", "2004", "2005",
 "2006", "2007", "2008", "2009"), c("Jan", "Feb", "Mar", "Apr",
 "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")))
 
 library(reshape)
 sas <- melt(as.matrix(sas), na.rm = TRUE)
 r <- melt(r, na.rm = TRUE)
 s <- melt(s, na.rm = TRUE)
 names(r) <- names(s) <- names(sas) <- c("year", "month", "count")
 
 sas$software <- "sas"
 s$software <- "s"
 r$software <- "r"
 all <- rbind(sas, s, r)
 all$date <- with(all,
   as.Date(paste(year, month, 15, sep = "-"), "%Y-%b-%d"))
 
 
 library(ggplot2)
 qplot(date, count, data = all, geom = "line", colour = software) +
   geom_smooth(se = F, size = 1)
 last_plot() + scale_y_log10(breaks = 10^(1:3), labels = 10^(1:3))
 
 yearly <- ddply(all, .(year, software), function(df) c(count = sum(df$count)))
 qplot(year, count, data = yearly, geom = "line", colour = software)


Hadley,

You might want to remove the 2009 data from each of the three lists
given that the January data is not yet complete.

The result of including the January 2009 data in your plots is that the
growth trajectory for the smoothed curves for SAS-L and R-Help appears to
be leveling or even declining, when at least for R-Help, that is not the
case. The S-News curve is not affected significantly, given the already
declining counts.

The effect of the 2009 data is most noticeable in the log scale plot.

Re: [R] R in the NY Times

2009-01-07 Thread hadley wickham
 You might want to remove the 2009 data from each of the three lists
 given that the January data is not yet complete.

 The result of including the January 2009 data in your plots is that the
 growth trajectory for the smoothed curves for SAS-L and R-Help appears to
 be leveling or even declining, when at least for R-Help, that is not the
 case. The S-News curve is not affected significantly, given the already
 declining counts.

 The effect of the 2009 data is most noticeable in the log scale plot.

 Thus:

 all <- subset(all, year < 2009)

Good point - thanks for the fix!
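
To illustrate why the incomplete month matters, here is a minimal sketch using made-up counts (not the list data above). The data frame `toy` and the helper `last_ratio` are hypothetical names introduced only for this example. A partial final month sits far below the trend, and filtering on `year < 2009` removes it:

```r
# Hypothetical monthly counts: steady growth, then a partial final month.
toy <- data.frame(
  year  = c(rep(2008, 12), 2009),
  month = c(1:12, 1),
  count = c(seq(2000, 3100, by = 100), 400)  # last value: month only partly elapsed
)

# Ratio of the last observation to the one before it.
last_ratio <- function(df) {
  n <- nrow(df)
  df$count[n] / df$count[n - 1]
}

last_ratio(toy)                       # far below 1: looks like a collapse
last_ratio(subset(toy, year < 2009))  # slightly above 1: growth continues
```

A smoother fitted to the unfiltered data is pulled toward that last low point, which is the apparent leveling described above.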

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R in the NY Times

2009-01-07 Thread Gabor Grothendieck
Note that the mts object I posted already had Jan 2009 removed and also
had the NA rows removed.
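
For readers working with a time series object rather than a data frame, the same trimming can be sketched in base R as follows. This is only an illustration on made-up numbers; the object names `x`, `x_trim` and `x_clean` are hypothetical and it is not necessarily how the posted mts object was prepared.

```r
# Hypothetical two-series monthly data, July 2008 to January 2009 (made-up numbers).
x <- ts(cbind(A = c(NA, 10, 12, 14, 16, 18, 3),
              B = c(NA, 20, 21, 22, 23, 24, 5)),
        start = c(2008, 7), frequency = 12)

# Drop the incomplete January 2009 by ending the window at December 2008.
x_trim <- window(x, end = c(2008, 12))

# Remove rows containing NA; na.omit() keeps the ts attributes as long as
# the NAs are at the start or end of the series, not in the middle.
x_clean <- na.omit(x_trim)

start(x_clean)  # first retained period
end(x_clean)    # last retained period
```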

On Wed, Jan 7, 2009 at 9:58 PM, hadley wickham h.wick...@gmail.com wrote:
 You might want to remove the 2009 data from each of the three lists
 given that the January data is not yet complete.

 The result of including the January 2009 data in your plots is that the
 growth trajectory for the smoothed curves for SAS-L and R-Help appears to
 be leveling or even declining, when at least for R-Help, that is not the
 case. The S-News curve is not affected significantly, given the already
 declining counts.

 The effect of the 2009 data is most noticeable in the log scale plot.

 Thus:

 all <- subset(all, year < 2009)

 Good point - thanks for the fix!

 Hadley

 --
 http://had.co.nz/




Re: [R] R in the NY Times

2009-01-07 Thread Dirk Eddelbuettel

On 7 January 2009 at 18:24, Gabor Grothendieck wrote:
| By running the code below we see that the:
| - sum of the three seems to be rising at a constant rate
| - S is declining
| - SAS and R are rising
| - R is rising the fastest, though it has completed its phase
| of highest growth, which ended around 2004

I wonder whether we need to account for traffic on all the additional r-sig-*
mailing lists ?  

Of the handful that I follow, some seem to have taken traffic from r-help.
This could account for (at least parts of) the apparent traffic growth
slowdown since 2004 as many of these added lists appeared only in the last
few years.

Dirk

-- 
Three out of two people have difficulties with fractions.



Re: [R] R in the NY Times

2009-01-07 Thread Gabor Grothendieck
On Wed, Jan 7, 2009 at 10:26 PM, Dirk Eddelbuettel e...@debian.org wrote:

 On 7 January 2009 at 18:24, Gabor Grothendieck wrote:
 | By running the code below we see that the:
 | - sum of the three seems to be rising at a constant rate
 | - S is declining
 | - SAS and R are rising
 | - R is rising the fastest, though it has completed its phase
 | of highest growth, which ended around 2004

 I wonder whether we need to account for traffic on all the additional r-sig-*
 mailing lists ?

 Of the handful that I follow, some seem to have taken traffic from r-help.
 This could account for (at least parts of) the apparent traffic growth
 slowdown since 2004 as many of these added lists appeared only in the last
 few years.


Good observation.  It would be interesting to combine the data from all
the lists to see what the effect is.
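
A rough sketch of how such a combination might be done in base R, assuming each list's monthly counts are available as a `ts` object. The numbers below are made up and the names `main_list` and `sig_list` are hypothetical; no actual r-sig-* counts are used here.

```r
# Hypothetical monthly post counts for two lists (made-up numbers).
main_list <- ts(c(100, 110, 120, 125, 128, 130), start = c(2004, 1), frequency = 12)
sig_list  <- ts(c(10, 20, 35),                   start = c(2004, 4), frequency = 12)

# ts.union() aligns the series on a common time axis, padding with NA
# for months in which a list did not yet exist.
both <- ts.union(main_list, sig_list)

# Treat "list did not exist yet" as zero traffic, then sum across lists.
both[is.na(both)] <- 0
combined <- ts(rowSums(both), start = start(both), frequency = frequency(both))

combined
```

In this toy case the main list flattens out after the newer list appears, while the combined series keeps rising, which is the kind of effect suggested above.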
