Re: [R] how to rewrite this without a loop ?

2004-11-19 Thread Stijn Lievens
Thomas Lumley wrote:
On Thu, 18 Nov 2004, Stijn Lievens wrote:

add.fun <- function(perf.data) {
  ss <- 0
  for (i in 0:29) {
    ss <- ss + cor(subset(perf.data, dataset == i)[3],
                   subset(perf.data, dataset == i)[7], method = "kendall")
  }
  ss
}


As one can see this function uses a for-loop.  Now chapter 9 of 'An 
introduction to R' tells us that we should avoid for-loops as much as 
possible.

You don't say whether `dataset' is the name of a column in `perf.data'. 
Assuming it is, and assuming that 0:29 are all the values of `dataset'

sum(by(perf.data, list(perf.data$dataset),
  function(d)  cor(d[,3],d[,7], method="kendall")))
would work.

Indeed, this works.  The 'by' command is exactly what I was looking for.
As far as I can tell, this useful command isn't mentioned in 'An 
introduction to R'.

If this is faster it will be because you don't call 
subset() twice per iteration, rather than because you are avoiding a 
loop.  However it has other benefits: it doesn't have the variable `i', 
it doesn't have to change the value of `ss', and it doesn't have the 
range of `dataset' hard-coded into it.  These are all clarity 
optimisations.
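
A minimal sketch of that comparison, assuming a made-up data frame: the column names, the group sizes and the helper names loop_sum and by_sum are illustrative and not from the original post. Only the positions of columns 3 and 7 and the `dataset' column follow the thread.

```r
# Hypothetical toy data: 5 groups of 10 rows; columns 3 and 7 are correlated.
set.seed(1)
n_sets <- 5
n_per  <- 10
perf.data <- data.frame(
  dataset = rep(0:(n_sets - 1), each = n_per),
  a = rnorm(n_sets * n_per),
  x = rnorm(n_sets * n_per),   # column 3
  b = rnorm(n_sets * n_per),
  c = rnorm(n_sets * n_per),
  d = rnorm(n_sets * n_per),
  y = rnorm(n_sets * n_per)    # column 7
)

# Loop version: the group ids must be supplied, and the subset is taken
# once per iteration instead of twice.
loop_sum <- function(perf.data, ids) {
  ss <- 0
  for (i in ids) {
    d <- perf.data[perf.data$dataset == i, ]
    ss <- ss + cor(d[, 3], d[, 7], method = "kendall")
  }
  ss
}

# by() version: the groups are discovered from the data itself.
by_sum <- function(perf.data) {
  sum(by(perf.data, list(perf.data$dataset),
         function(d) cor(d[, 3], d[, 7], method = "kendall")))
}

loop_sum(perf.data, 0:4)
by_sum(perf.data)

# Dropping a group needs no change to by_sum(); the loop needs new ids.
reduced <- perf.data[perf.data$dataset != 2, ]
by_sum(reduced)
loop_sum(reduced, c(0, 1, 3, 4))
```

Under these assumptions both versions should return the same sum; the difference is that by_sum() needs no knowledge of which values `dataset' takes.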

In fact I don't care too much about speed at the moment, but a one-line 
statement is more convenient to type (and recall) in the command line 
interface than a multi-line statement.

Your solution really does the trick for me.  Thanks,
Stijn.

-thomas
__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] how to rewrite this without a loop ?

2004-11-18 Thread Stijn Lievens
Stijn Lievens wrote:
Dear Rexperts,
First of all let me say that R is a wonderful and useful piece of software.
The only thing is that sometimes it takes me a long time to find out how 
something can be done, especially when aiming to write compact (and 
efficient) code.

For instance, I have the following function (very rudimentary) which 
takes a (very specific) data frame as input and for certain subsets
calculates the rank correlation between two corresponding columns.
The aim is to add all the rank correlations.


add.fun <- function(perf.data) {
  ss <- 0
  for (i in 0:29) {
    ss <- ss + cor(subset(perf.data, dataset == i)[3],
                   subset(perf.data, dataset == i)[7], method = "kendall")
  }
  ss
}


As one can see this function uses a for-loop.  Now chapter 9 of 'An 
introduction to R' tells us that we should avoid for-loops as much as 
possible.

Is there an obvious way to avoid this for-loop in this case?

Using the lapply function from James's e-mail, I came up with the 
following.


sum(as.numeric(lapply(split(perf.data, perf.data$dataset),
                      function(x) cor(x[3], x[7], method = "kendall"))))


So, first I split the dataframe into a list of dataframes using split,
and using lapply I get a list of correlations, which I convert to
numeric and finally sum up.
I definitely avoided the for-loop in this way, although I am not sure 
whether this is more efficient or not.
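
The split-then-apply steps above can be sketched on toy data as follows. This is only an illustration: the data frame is made up, and vapply() is swapped in for lapply() plus as.numeric() because it returns a numeric vector directly.

```r
# Hypothetical toy data: 7 numeric columns plus a 'dataset' id (3 groups of 8).
set.seed(42)
perf.data <- data.frame(matrix(rnorm(3 * 8 * 7), ncol = 7))
perf.data$dataset <- rep(0:2, each = 8)

# Step 1: split the data frame into a list of per-dataset data frames.
pieces <- split(perf.data, perf.data$dataset)

# Step 2: one Kendall correlation per piece, returned as a numeric vector.
taus <- vapply(pieces,
               function(x) cor(x[[3]], x[[7]], method = "kendall"),
               numeric(1))

# Step 3: add them up.
sum(taus)
```

The names of taus are the dataset ids, which makes it easy to inspect the individual correlations before summing.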

Cheers,
Stijn.

I would like to see something along the lines of
(maple style)

add( seq(FUN(i), i = 0..29) )

Greetings
Stijn.


--
==
Dept. of Applied Mathematics and Computer Science, University of Ghent
Krijgslaan 281 - S9, B - 9000 Ghent, Belgium
Phone: +32-9-264.48.91, Fax: +32-9-264.49.95
E-mail: [EMAIL PROTECTED], URL: http://allserv.ugent.be/~slievens/


[R] how to rewrite this without a loop ?

2004-11-18 Thread Stijn Lievens
Dear Rexperts,
First of all let me say that R is a wonderful and useful piece of 
software.

The only thing is that sometimes it takes me a long time to find out how 
something can be done, especially when aiming to write compact (and 
efficient) code.

For instance, I have the following function (very rudimentary) which 
takes a (very specific) data frame as input and for certain subsets
calculates the rank correlation between two corresponding columns.
The aim is to add all the rank correlations.


add.fun <- function(perf.data) {
  ss <- 0
  for (i in 0:29) {
    ss <- ss + cor(subset(perf.data, dataset == i)[3],
                   subset(perf.data, dataset == i)[7], method = "kendall")
  }
  ss
}


As one can see this function uses a for-loop.  Now chapter 9 of 'An 
introduction to R' tells us that we should avoid for-loops as much as 
possible.

Is there an obvious way to avoid this for-loop in this case?
I would like to see something along the lines of
(maple style)

add( seq(FUN(i), i = 0..29) )
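
For comparison, a rough R analogue of that Maple idiom might look like the sketch below. FUN here is a stand-in chosen only so that the result is easy to check by hand; it is not the correlation function from the post.

```r
# FUN is a hypothetical placeholder for any function of the index i.
FUN <- function(i) i^2

# Apply FUN to each index in 0:29 and add up the results.
total <- sum(sapply(0:29, FUN))
total  # 8555, the sum of squares 0^2 + 1^2 + ... + 29^2
```

In the setting of the post, FUN would instead compute the rank correlation for the subset with dataset == i.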

Greetings
Stijn.