[R] Calculating mean together with split

2006-09-20 Thread Rainer M Krug
Hi

I have a table called npl containing results of simulations.

It contains about 19000 entries and the structure looks like this:

  NoPlants  sim run year DensPlants
16 lng_cs99_renosterbos   140.00192
.
.
.


It has 43 different entries for sim; year goes from 1 to 100, and run 
from 1 to 5.

I would like to calculate the mean of DensPlants for each simulation and 
each year separately, i.e. the mean over run for every combination of 
sim and year.

I can use

split(npl, npl$sim)

to split npl into different groups, each containing the entries for one 
parameter set - but where to go from there?

Rainer

-- 
Rainer M. Krug, Dipl. Phys. (Germany), MSc Conservation
Biology (UCT)

Department of Conservation Ecology and Entomology
University of Stellenbosch
Matieland 7602
South Africa

Tel:+27 - (0)72 808 2975 (w)
Fax:+27 - (0)21 808 3304
Cell:   +27 - (0)83 9479 042

email:  [EMAIL PROTECTED]
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
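
One way to continue from the split() call in the question is to split only the DensPlants column by both grouping columns and then average each piece. This is a minimal sketch, not code from the thread: the toy npl below (2 sims, 3 years, 5 runs, made-up values) is an assumption standing in for the real table, and only the column names sim, run, year and DensPlants are taken from the post.

```r
# Toy stand-in for npl (assumption): 2 sims x 3 years x 5 runs, made-up values
npl <- expand.grid(run = 1:5, year = 1:3, sim = c("simA", "simB"),
                   stringsAsFactors = FALSE)
npl$DensPlants <- seq_len(nrow(npl)) / 100

# Split DensPlants by every sim/year combination; drop = TRUE discards
# combinations that do not occur in the data
groups <- split(npl$DensPlants, list(npl$sim, npl$year), drop = TRUE)

# Average over run within each group; names look like "simA.1", "simB.1", ...
means <- sapply(groups, mean)
print(means)
```

The result is a named numeric vector with one entry per sim/year combination, which is compact but less convenient for plotting than a data frame.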


Re: [R] Calculating mean together with split

2006-09-20 Thread David Barron
Have a look at the function aggregate.table in the package gtools
(part of the gregmisc bundle).

-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP



Re: [R] Calculating mean together with split

2006-09-20 Thread David Barron
Sorry, that should have been package gdata, not gtools...they're both
in the same bundle, though.



Re: [R] Calculating mean together with split

2006-09-20 Thread David Barron
Of course, aggregate will work too; it depends on how you want the output
to be formatted.  You could also look at summarize in the Hmisc
package.
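
For illustration, the aggregate suggestion might look like the sketch below. It uses base R only; the toy npl (made-up values) is an assumption standing in for the real table, and the formula interface shown here needs a reasonably recent R version (the by = list(...) form is the older equivalent).

```r
# Toy stand-in for npl (assumption): 2 sims x 3 years x 5 runs, made-up values
npl <- expand.grid(run = 1:5, year = 1:3, sim = c("simA", "simB"),
                   stringsAsFactors = FALSE)
npl$DensPlants <- seq_len(nrow(npl)) / 100

# Mean of DensPlants over run for each sim/year combination,
# returned as a data frame with columns sim, year, DensPlants
npl_mean <- aggregate(DensPlants ~ sim + year, data = npl, FUN = mean)
print(npl_mean)
```

The output has one row per sim/year combination, i.e. the tabular (long) layout that is easy to pass on to plotting functions.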



Re: [R] Calculating mean together with split

2006-09-20 Thread Rainer M Krug
Hi David

aggregate is what I was looking for, as I wanted to have it in the 
tabular format to plot it.

Thanks

Rainer



Re: [R] Calculating mean together with split

2006-09-20 Thread hadley wickham
You can do this pretty easily with the reshape package:

library(reshape)
dfm <- rename(df, c(DensPlants = "value")) # this is the form that reshape wants

# Then try one of these:

cast(dfm, year ~ sim)
cast(dfm, year + sim ~ . )
cast(dfm, year ~ sim, margins=TRUE)

Depending on what format you want the resulting summaries in.

Hadley



Re: [R] Calculating mean together with split

2006-09-20 Thread hadley wickham
 # Then try one of these:

 cast(dfm, year ~ sim)
 cast(dfm, year + sim ~ . )
 cast(dfm, year ~ sim, margins=TRUE)

Oops that should be:

dfm <- rename(df, c(DensPlants = "value"))

cast(dfm, year ~ sim, mean)
cast(dfm, year + sim ~ . , mean)
cast(dfm, year ~ sim, mean, margins=TRUE)

(Thanks for pointing that out Gabor!)

Hadley
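
For readers without the reshape package, a year-by-sim table of means similar in layout to cast(dfm, year ~ sim, mean) can be sketched in base R with tapply. This is a swapped-in base-R alternative, not the reshape method itself, and the toy npl (made-up values) is an assumption standing in for the real table.

```r
# Toy stand-in for npl (assumption): 2 sims x 3 years x 5 runs, made-up values
npl <- expand.grid(run = 1:5, year = 1:3, sim = c("simA", "simB"),
                   stringsAsFactors = FALSE)
npl$DensPlants <- seq_len(nrow(npl)) / 100

# Rows are years, columns are sims, cells are means of DensPlants over run
wide <- tapply(npl$DensPlants, list(year = npl$year, sim = npl$sim), mean)
print(wide)
```

The result is a matrix rather than a data frame; wrap it in as.data.frame(wide) if a data frame is needed.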
