Re: [R] What made us so popular Nov 16-20?

2005-11-28 Thread Seeliger . Curt
Duncan asks:
> Did we get mentioned somewhere (e.g. Slashdot), or was someone just
> experimenting with some automated downloading?

R was mentioned in last week's (I think) O'Reilly newsletter, which
included a link to a short article showing how easy it is to get R to
graph stuff like stock price histories.  That's the publisher, not the
talking head.

For what it's worth, the article isn't worth chasing down.  It left a
beginner like me disappointed that R's capabilities weren't better
shown, and that the author relied on Perl to do the data manipulation.

cur

--
Curt Seeliger, Data Ranger
CSC, EPA/WED contractor
541/754-4638
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Assign references

2005-10-07 Thread Seeliger . Curt
Folks,

I've run into trouble while writing functions that I hope will create
and modify a dataframe or two.  To that end I've written a toy function
that simply sets a couple of variables (well, tries but fails).
Searching the archives, Thomas Lumley recently explained the <<-
operator, showing that it was necessary for x and y to exist prior to
the function call, but I haven't the faintest why this isn't working:

> myFunk<-function(a,b,foo,bar) {foo<<-a+b; bar<<-a*b;}
> x<-0; y<-0;
> myFunk(4,5,x,y)
> x<-0; y<-0;
> myFunk(4,5,x,y)
> x
[1] 0
> y
[1] 0

What (no doubt simple) reason is there for x and y not changing?
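
A minimal sketch of what is probably going on, assuming the function body
used <<- as described above.  The <<- operator assigns to the name written
on its left (foo, bar), searching the enclosing environments for a variable
of that name; it does not follow the argument back to the caller's x and y.
The names myFunkOrig, res, sum and product are illustrative only.

```r
# The pattern from the question: `<<-` assigns to the names foo and bar,
# not to whatever variables were passed in as arguments.
myFunkOrig <- function(a, b, foo, bar) { foo <<- a + b; bar <<- a * b }
x <- 0; y <- 0
myFunkOrig(4, 5, x, y)
# x and y are untouched here; the results went to variables named foo and bar.

# One common alternative: return the values and assign them at the call site.
myFunk <- function(a, b) {
  list(sum = a + b, product = a * b)
}
res <- myFunk(4, 5)
x <- res$sum
y <- res$product
```

Returning a list keeps the function free of side effects, which is usually
easier to reason about than assigning into the caller's workspace.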

Thank you,
cur
--
Curt Seeliger, Data Ranger
CSC, EPA/WED contractor
541/754-4638
[EMAIL PROTECTED]



Re: [R] Calculation of group summaries

2005-07-14 Thread Seeliger . Curt
Several people suggested specific functions (by, tapply, sapply and
others); thanks for not blowing off a simple question regarding how to
do the following SQL in R:
   select year,
          site_id,
          visit_no,
          mean(undercut) AS meanUndercut,
          count(undercut) AS nUndercut,
          std(undercut) AS stdUndercut
   from channelMorphology
   group by year, site_id, visit_no
   ;

I'd spent quite a bit of time with the suggested functions earlier but
had no luck as I'd misread the docs and put the entire dataframe where
it only wants the columns to be processed.  Sometimes it's the simplest
of things.

This has led to another confoundment: sd() acts differently than
mean(), at least with R 1.9.0.  For some reason, means
generate NA results and a warning message for each group:

  argument is not numeric or logical: returning NA in:
mean.default(data[x, ], ...)

Of course, the argument is numeric, or there'd be no sd value.  Or more
likely, I'm still missing something really basic. If I wrap the value in
as.numeric() things work fine.  Why should I have to do this for mean
and median, but not sd? The code below should reproduce this warning:

  # Fake data for demo:
  nsites<-6
  yearList<-1999:2001
  fakesub<-as.data.frame(cbind(
     year     =rep(yearList,nsites/length(yearList),each=11)
    ,site_id  =rep(c('site1','site2'),each=11*nsites)
    ,visit_no =rep(1,11*2*nsites)
    ,transect =rep(LETTERS[1:11],nsites,each=2)
    ,transdir =rep(c('LF','RT'),11*nsites)
    ,undercut =abs(rnorm(11*2*nsites,10))
    ,angle    =runif(11*2*nsites,0,180)
    ))

  # Create group summaries:
  sdmets<-by(fakesub$undercut
            ,list(fakesub$year,fakesub$site_id,fakesub$visit_no)
            ,sd
            )
  nmets<-by(fakesub$undercut
           ,list(fakesub$year,fakesub$site_id,fakesub$visit_no)
           ,length
           )
  xmets<-by(fakesub$undercut
           ,list(fakesub$year,fakesub$site_id,fakesub$visit_no)
           ,mean
           )
  # Same summary with undercut wrapped in as.numeric() (no warning):
  xmets<-by(as.numeric(fakesub$undercut)
           ,list(fakesub$year,fakesub$site_id,fakesub$visit_no)
           ,mean
           )

  # Put site id values (year, site_id and visit_no) into results:
  # List unique id combinations as a list of lists.  Then
  # reorganize that into 3 vectors for final results.
  # Certainly, there MUST be a better way...
  foo<-strsplit(unique(paste(fakesub$year
                            ,fakesub$site_id
                            ,fakesub$visit_no
                            ,sep='#'))
               ,split='#'
               )
  year<-list()
  for(i in 1:length(foo)) {year<-rbind(year,foo[[i]][1])}
  site_id<-list()
  for(i in 1:length(foo)) {site_id<-rbind(site_id,foo[[i]][2])}
  visit_no<-list()
  for(i in 1:length(foo)) {visit_no<-rbind(visit_no,foo[[i]][3])}

  # Final result, more or less
  data.frame(cbind(a=year,b=site_id,c=visit_no,sdmets,nmets,xmets))
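
A likely cause of the NA means is the cbind() call.  cbind() builds a
matrix, and a matrix holding both numbers and strings is stored entirely as
character, so every column of fakesub (undercut included) ends up as a
factor or character column rather than a numeric one.  Note also that
as.numeric() applied to a factor returns the internal level codes, not the
original values, so the workaround can run without warnings while
summarising the wrong numbers.  The sketch below is one possible way around
both problems, not the only one: it builds the data with data.frame()
directly and lets aggregate() carry the id columns along.  The seed, the
reduced column set, and the names n and summ are assumptions made for the
example.

```r
set.seed(42)                      # assumed seed, only for reproducibility
n <- 11 * 2 * 6                   # 132 rows of fake data

# cbind() on mixed types yields a character matrix:
mixed <- cbind(num = c(1.5, 2.5), chr = c("a", "b"))
is.character(mixed)               # TRUE: the numbers were coerced to strings

# data.frame() keeps each column's own type.
fakesub <- data.frame(
  year     = rep(rep(1999:2001, each = 11), length.out = n),
  site_id  = rep(c("site1", "site2"), each = n / 2),
  visit_no = 1,
  undercut = abs(rnorm(n, 10))
)

# One row per (year, site_id, visit_no) with mean, count and sd,
# roughly matching the SQL GROUP BY query.
summ <- aggregate(
  undercut ~ year + site_id + visit_no,
  data = fakesub,
  FUN  = function(v) c(mean = mean(v), n = length(v), sd = sd(v))
)
summ <- do.call(data.frame, summ) # flatten the matrix column into 3 columns
```

The result has the id columns plus undercut.mean, undercut.n and
undercut.sd, so no strsplit() bookkeeping is needed to recover the ids.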


cur

--
Curt Seeliger, Data Ranger
CSC, EPA/WED contractor
541/754-4638
[EMAIL PROTECTED]



[R] Calculation of group summaries

2005-07-12 Thread Seeliger . Curt
I know R has a steep learning curve, but from where I stand the slope
looks like a sheer cliff.  I'm pawing through the available docs and
have come across examples which come close to what I want but are
proving difficult for me to modify for my use.

Calculating simple group means is fairly straightforward:
  data(PlantGrowth)
  attach(PlantGrowth)
  stack(mean(unstack(PlantGrowth)))
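
As a hedged aside, the same single-factor group means can also be sketched
with tapply(), which does not depend on mean() accepting a data frame.  The
variable name groupMeans is illustrative only.

```r
# PlantGrowth ships with R's datasets package: columns weight and group.
data(PlantGrowth)

# Mean weight per treatment group, returned as a named vector.
groupMeans <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
print(groupMeans)
```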

I'd like to do something slightly more complex, using a data frame and
groups identified by unique combinations of three id variables.  There
may be thousands of such combinations in the data.  This is easy in SQL:

  select year,
         site_id,
         visit_no,
         mean(undercut) AS meanUndercut,
         count(undercut) AS nUndercut,
         std(undercut) AS stdUndercut
  from channelMorphology
  group by year, site_id, visit_no
  ;

Reading a CSV written by SAS and selecting only records expected to have
values is also straightforward in R, but getting those summary values
for each site visit is currently beyond me:

  sub<-read.csv('c:/data/channelMorphology.csv'
               ,header=TRUE
               ,na.strings='.'
               ,sep=','
               ,strip.white=TRUE
               )

  undercut<-subset(sub
                  ,TRANSDIR %in% c('LF','RT')
                  ,select=c('YEAR','SITE_ID','VISIT_NO','TRANSECT','TRANSDIR'
                           ,'UNDERCUT'
                           )
                  ,drop=TRUE
                  )
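
A minimal, self-contained sketch of the read / subset / summarise steps
described above.  The real CSV is not available, so a tiny made-up file is
written to a temporary path; its contents, csv_path, grp, nUndercut and
meanUndercut are all assumptions for illustration.  The count uses
sum(!is.na(v)) on the assumption that the SQL count(undercut) is meant to
skip missing values.

```r
# Hypothetical stand-in for the SAS-written CSV ('.' marks a missing value).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "YEAR,SITE_ID,VISIT_NO,TRANSECT,TRANSDIR,UNDERCUT",
  "1999,site1,1,A,LF,0.10",
  "1999,site1,1,A,RT,0.30",
  "1999,site1,1,A,CT,.",
  "2000,site2,1,B,LF,.",
  "2000,site2,1,B,RT,0.20"
), csv_path)

sub <- read.csv(csv_path, header = TRUE, na.strings = ".", strip.white = TRUE)

# Keep only left/right bank records and the columns of interest.
undercut <- subset(sub, TRANSDIR %in% c("LF", "RT"),
                   select = c("YEAR", "SITE_ID", "VISIT_NO",
                              "TRANSECT", "TRANSDIR", "UNDERCUT"))

# Summaries per site visit; NA values are skipped explicitly.
grp <- undercut[c("YEAR", "SITE_ID", "VISIT_NO")]
nUndercut    <- aggregate(undercut["UNDERCUT"], grp, function(v) sum(!is.na(v)))
meanUndercut <- aggregate(undercut["UNDERCUT"], grp, mean, na.rm = TRUE)
```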


Thanks all for your help.
cur
--
Curt Seeliger, Data Ranger
CSC, EPA/WED contractor
541/754-4638
[EMAIL PROTECTED]
