Re: [R] How to test if a slope is different than 1?

2012-04-26 Thread Mark Na
Hi Greg and others,

Thanks for your replies. Okay, I'm convinced that the offset is the best
approach and wonder if you might have a quick look at what I did.

Here's the original model, containing the slope (0.56) that I'd like to
test against 1.0:

model1 <- glm(log(data$AB.obs+1,10) ~ log(data$SIZE,10) + data$YEAR)

and its coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)        -1.18253    0.09119 -12.967  < 2e-16 ***
log(data$SIZE, 10)  0.56001    0.02564  21.843  < 2e-16 ***
data$YEAR2008       0.16823    0.04366   3.853 0.000152 ***
data$YEAR2009       0.20299    0.04707   4.313  2.4e-05 ***

And here's the model with an offset term:

model2 <- glm(log(data$AB.obs+1,10) ~ log(data$SIZE,10) +
    offset(log(data$SIZE,10)) + data$YEAR)

and its coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)        -1.18253    0.09119 -12.967  < 2e-16 ***
log(data$SIZE, 10) -0.43999    0.02564 -17.162  < 2e-16 ***
data$YEAR2008       0.16823    0.04366   3.853 0.000152 ***
data$YEAR2009       0.20299    0.04707   4.313  2.4e-05 ***

So, if I understand correctly, the small P-value corresponding to the SIZE
coefficient in model2 indicates that the slope of 0.56 in model1 is
significantly different from 1.0, right?

If I may ask one more question: could I use the offset to test if the slope
of 0.56 is different from yet another value, e.g., 0.5?

Much appreciated.

Many thanks, Mark Na
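
On the last question, a minimal sketch of how an offset could test the slope against a value other than 1. It assumes simulated stand-in data (the data frame `d` and its columns `logsize` and `logab` are made up for illustration, not the real SIZE and AB.obs). Multiplying the offset by the hypothesised slope moves the null value, so the coefficient reported for the covariate becomes (slope - 0.5):

```r
# Sketch with simulated data; `d`, `logsize` and `logab` are hypothetical
# stand-ins for the real SIZE and AB.obs columns.
set.seed(42)
d <- data.frame(logsize = log10(runif(200, 1, 1000)))
d$logab <- -1.2 + 0.56 * d$logsize + rnorm(200, sd = 0.28)

fit_plain <- lm(logab ~ logsize, data = d)

# offset(0.5 * logsize) subtracts 0.5 * logsize from the response, so the
# reported coefficient is (slope - 0.5) and its t test is a test of slope = 0.5
fit_offset <- lm(logab ~ logsize + offset(0.5 * logsize), data = d)

coef(summary(fit_plain))["logsize", ]
coef(summary(fit_offset))["logsize", ]
```

The standard error is the same in both fits; only the estimate, and hence the t value and p-value, shifts.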




On Wed, Apr 25, 2012 at 3:27 PM, Greg Snow 538...@gmail.com wrote:

 Doesn't the p-value from using offset work for you?  if you really
 need a p-value.  The confint method is a quick and easy way to see if
 it is significantly different from 1 (see Rolf's response), but does
 not provide an exact p-value.  I guess you could do confidence
 intervals at different confidence levels until you find the level such
 that one of the limits is close enough to 1, but that seems like way
 too much work.  You could also compute the p-value by taking the slope
 minus 1 divided by the standard error and plug that into the pt
 function with the correct degrees of freedom.  You could even write a
 function to do that for you, but it still seems more work than adding
 the offset to the formula.
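
The pt-based calculation described above could be sketched roughly as follows. The helper name `slope_test` is made up here, and the built-in `cars` data stands in for the real model; treat it as an illustration rather than a tested utility:

```r
# Hypothetical helper: two-sided t test of H0: coefficient == value
slope_test <- function(fit, term, value = 1) {
  cf   <- coef(summary(fit))
  tval <- (cf[term, "Estimate"] - value) / cf[term, "Std. Error"]
  pval <- 2 * pt(abs(tval), df = df.residual(fit), lower.tail = FALSE)
  c(t = tval, p = pval)
}

fit <- lm(dist ~ speed, data = cars)   # built-in dataset, used only for illustration
res <- slope_test(fit, "speed", value = 1)
res
```

The t value and p-value should agree with those reported for `speed` when the same model is refitted with `offset(speed)` added to the formula.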

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to test if a slope is different than 1?

2012-04-24 Thread Mark Na
Hi Greg. Thanks for your reply. Do you know if there is a way to use the
confint function to get a p-value on this test?

Thanks, Mark


On Mon, Apr 23, 2012 at 3:10 PM, Greg Snow 538...@gmail.com wrote:

 One option is to subtract the continuous variable from y before doing
 the regression (this works with any regression package/function).  The
 probably better way in R is to use the 'offset' function:

 formula = I(log(data$AB.obs + 1, 10) - log(data$SIZE, 10)) ~
     log(data$SIZE, 10) + data$Y
 formula = log(data$AB.obs + 1, 10) ~ offset(log(data$SIZE, 10)) +
     log(data$SIZE, 10) + data$Y

 Or you can use a function like 'confint' to find the confidence
 interval for the slope and see if 1 is in the interval.
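
A small sketch of the confint route, assuming the built-in `cars` data as a stand-in for the real model (the object names are illustrative only):

```r
fit <- lm(dist ~ speed, data = cars)   # built-in data, for illustration only

# 95% confidence interval for the slope of `speed`
ci <- confint(fit, "speed", level = 0.95)
ci

# TRUE would mean 1 lies inside the interval, i.e. no evidence against
# slope = 1 at the 5% level
one_inside <- ci[1, 1] <= 1 && 1 <= ci[1, 2]
one_inside
```

This gives a yes/no answer at the chosen level but, as with any interval check, no exact p-value.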



[R] How to test if a slope is different than 1?

2012-04-23 Thread Mark Na
Dear R-helpers,

I would like to test if the slope corresponding to a continuous variable in
my model (summary below) is different than one.

I would appreciate any ideas for how I could do this in R, after having
specified and run this model?

Many thanks,

Mark Na



Call:
lm(formula = log(data$AB.obs + 1, 10) ~ log(data$SIZE, 10) +
    data$Y)

Residuals:
     Min       1Q   Median       3Q      Max
-0.94368 -0.13870  0.04398  0.17825  0.63365

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)        -1.18282    0.09120 -12.970  < 2e-16 ***
log(data$SIZE, 10)  0.56009    0.02564  21.846  < 2e-16 ***
data$Y2008          0.16825    0.04366   3.854 0.000151 ***
data$Y2009          0.20310    0.04707   4.315 2.38e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2793 on 228 degrees of freedom
Multiple R-squared: 0.6768, Adjusted R-squared: 0.6726
F-statistic: 159.2 on 3 and 228 DF,  p-value: < 2.2e-16



[R] Problem with plotting a square 1 x 3 plot and placement of outer margin text

2012-02-07 Thread Mark Na
Dear R-helpers,

Please see the attached plot.

The problem is that I have too much space between the x-axis label
(which is mtext in an outer margin) and the plots.

My par settings for this plot are:

par(mfrow=c(1,3),oma=c(2,2,2,2),mar=c(5.1,4.1,4.1,2.1),pty="s")
#here is the code that produces the three plots, which I have deleted for simplicity
mtext("Log Wetland Area",side=1,outer=TRUE)

It works fine (less space between plots and outer margin text) when I
set pty="m" but then I get very long and skinny rectangular plots. I
would like to keep the square plots.

Any help would be much appreciated!

Many thanks,

Mark
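
A possible explanation, offered as an assumption since the plot and device size are not shown: with pty="s" each plot region is forced square, so on a device that is not about three times as wide as it is tall the panels leave empty space above and below, while the outer-margin text stays at the bottom edge. A sketch of two adjustments, using a temporary PDF file and random data invented for the example:

```r
out <- tempfile(fileext = ".pdf")

# A device about three times as wide as it is tall lets three square panels fill it
pdf(out, width = 9, height = 3.5)
par(mfrow = c(1, 3), oma = c(2, 0, 0, 0), mar = c(3, 4, 2, 1), pty = "s")
for (i in 1:3) plot(rnorm(20), rnorm(20), xlab = "", ylab = "y")

# `line` positions the text within the outer margin; negative values move it
# up towards the panels
mtext("Log Wetland Area", side = 1, outer = TRUE, line = 0.5)
dev.off()
```

On a fixed-size device, a negative `line` value in mtext() alone may be enough to pull the label closer to the plots.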


[R] Post hoc test for lm() or glm() ?

2012-02-02 Thread Mark Na
Hi R-helpers,

TukeyHSD() works for models fitted with aov(), but could anyone point
me to a function that performs a similar post hoc test for models
fitted with lm() or glm()?

Thanks in advance,

Mark



Re: [R] Post hoc test for lm() or glm() ?

2012-02-02 Thread Mark Na
Thank you Richard and Frank for your very quick and helpful replies.

Cheers, Mark


On Thu, Feb 2, 2012 at 2:58 PM, Frank Harrell f.harr...@vanderbilt.edu wrote:
 The R multcomp package provides one general approach to multiplicity
 correction.  For general contrasts in lm and glm, the rms package's ols and
 Glm functions make this even easier to use.
 Frank
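
A hedged sketch of the multcomp approach mentioned above, using the built-in `warpbreaks` data rather than the poster's model. It assumes the multcomp package is installed and skips the comparison otherwise:

```r
fit <- lm(breaks ~ tension + wool, data = warpbreaks)  # built-in data

if (requireNamespace("multcomp", quietly = TRUE)) {
  # All pairwise (Tukey-type) comparisons among the levels of `tension`
  comp <- multcomp::glht(fit, linfct = multcomp::mcp(tension = "Tukey"))
  print(summary(comp))
} else {
  message("multcomp is not installed, so the comparison was skipped")
}
```

The same call pattern is expected to work for a glm fit; the factor name inside mcp() has to match the factor in the model.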



[R] Post-hoc test on ANCOVA

2012-02-01 Thread Mark Na
Dear R-helpers,

I have an ANCOVA with a significant effect of the factor, which has
three levels. I wish to determine which of the levels are different
from each other but, because my model was fitted with lm(), I cannot
use TukeyHSD.

For some reason, I get different results (no significant effect of the
factor) when I fit the model using aov() so, for the moment, I am
using lm().

Could anyone point me to a test and associated R function that will
work on a fitted lm() or glm()?

Many thanks,

Mark Na
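
One possible reason for the lm() versus aov() discrepancy, stated as an assumption because the model calls are not shown: summary() of an aov fit reports sequential tests, which depend on the order of the terms, whereas the tests in an lm summary adjust each term for the others. A sketch with the built-in `iris` data standing in for the ANCOVA (factor Species, covariate Petal.Length):

```r
# Same model, terms entered in two different orders
fit_a <- lm(Sepal.Length ~ Species + Petal.Length, data = iris)
fit_b <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)

# Sequential (Type I) tests: the Species row depends on the order of terms
anova(fit_a)
anova(fit_b)

# Marginal tests: each term adjusted for the other, order does not matter
drop1(fit_a, test = "F")
```

If the factor is listed before the covariate, the sequential test for the factor ignores the covariate, which can change its significance.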



[R] How to write a list object's name to a new dataframe in that list object

2011-10-07 Thread Mark Na
Hello R-helpers,

I have a list that only contains dataframes. Each element of the list
(i.e., each dataframe) has a unique name ("one" through "ten"). I wish
to add a new column (called NAME) to each list element (i.e., each
dataframe) and I want that column to contain the name of its list
element.

e.g. the list element (i.e., dataframe) called "one" would get a new
column called NAME that would contain the word "one" in every row.

Could anyone help with that?

Many thanks,

Mark Na
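
A minimal sketch of one way to do this, assuming a small made-up list `dfs` in place of the real one:

```r
# Hypothetical list of data frames named "one", "two", "three"
dfs <- list(one   = data.frame(x = 1:2),
            two   = data.frame(x = 3:5),
            three = data.frame(x = 6))

# Map() walks over the data frames and their names in parallel
dfs <- Map(function(df, nm) {
  df$NAME <- nm   # the single name is recycled down every row
  df
}, dfs, names(dfs))

dfs$two
```

The list keeps its element names, so the result can replace the original list directly.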



[R] Poisson GLM with a logged dependent variable...just asking for trouble?

2011-07-01 Thread Mark Na
Dear R-helpers,

I'm using a GLM with poisson errors to model integer count data as a
function of one non-integer covariate.

The model formula is: glm(log(DV) ~ log(IV,10), family=poisson).

I'm getting a warning because the logged DV is no longer an integer.

I have three questions:

1) Can I ignore the warning, or is logging the DV (resulting in
non-integers) a serious violation of the Poisson error structure?

2) If the answer to #1 is "no, don't ignore it, it's serious", then can
I use a quasipoisson error structure instead (it does not give the same
warning) and, if so, are there any pitfalls to using the quasipoisson
model? Are there any better alternatives for count data where the
counts must be logged? Or, should I just abandon logging the DV? In
that case, how could I compare the fit of a Poisson model (without
logging the DV) to that of a GLM with normal errors (with a logged
DV). AIC would not be valid because the DVs are different, right?

3) The quasipoisson model doesn't return an AIC value. Why, and is
there anything I can do to calculate AIC manually, that would allow me
to compare this model to other models?

Many thanks in advance for your help!

Cheers, Mark
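
A sketch that may help frame these questions, using simulated counts (the vectors `IV` and `DV` below are invented for the example). With family=poisson the default log link already models the log of the expected count, so the unlogged counts can go on the left-hand side:

```r
set.seed(1)
# Simulated stand-ins for the real variables: IV is a positive covariate, DV a count
IV <- runif(100, 1, 1000)
DV <- rpois(100, lambda = exp(0.2 + 0.5 * log10(IV)))

# The log link models log(E[DV]), so the raw counts are the response
pois_fit  <- glm(DV ~ log10(IV), family = poisson)
quasi_fit <- glm(DV ~ log10(IV), family = quasipoisson)

AIC(pois_fit)    # defined: the Poisson model has a likelihood
AIC(quasi_fit)   # NA: quasi-families are fitted by quasi-likelihood

# The dispersion estimate shows how far the variance departs from the Poisson assumption
summary(quasi_fit)$dispersion
```

The two fits give the same coefficients; the quasipoisson fit only rescales the standard errors by the estimated dispersion.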



Re: [R] Compare two dataframes

2010-12-17 Thread Mark Na
Hi Petr,

Many thanks for your help. I like your solution because (and I did not
know this) the unique function works on ALL the data at once (i.e.,
across all of the columns) which means I don't have to make a unique
ID field by pasting together all of the columns or run through all of
the columns iteratively (say, by using a loop).

However, if the dataframe contains non-unique rows (two rows with
exactly the same values in each column) then the unique function will
delete one of them and that may not be desirable. So, caution is
required.

Thanks again for the time you took to help me better understand the
unique function. Much appreciated. Děkuji!

Mark
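
The caveat about repeated rows can be seen in a tiny made-up example (the data frame `df` is purely illustrative):

```r
df <- data.frame(a = c(1, 1, 2), b = c("x", "x", "y"))

nrow(df)            # 3 rows, the first two identical
nrow(unique(df))    # 2 rows: the repeated row is kept only once
duplicated(df)      # flags the second occurrence, handy for checking first
```

Checking any(duplicated(DF1)) before relying on unique(rbind(DF1, DF2)) shows whether this matters for a given dataset.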



On Fri, Dec 17, 2010 at 2:27 AM, Petr Savicky savi...@cs.cas.cz wrote:
 On Thu, Dec 16, 2010 at 01:02:29PM -0600, Mark Na wrote:
 Hello,

 I have two dataframes DF1 and DF2 that should be identical but are not
 (DF1 has some rows that aren't in DF2, and vice versa). I would like
 to produce a new dataframe DF3 containing rows in DF1 that aren't in
 DF2 (and similarly DF4 would contain rows in DF2 that aren't in DF1).

 The function unique(DF) removes duplicated rows of DF and keeps the unique
 rows in the order of their first occurrence. So, if DF1 does not contain
 duplicated rows, then unique(rbind(DF1, DF2)) contains first DF1 and
 then the rows, which are unique to DF2, if there are any. The order of
 the rows in the result depends on the order of the original data frames
 and if DF2 contains several instances of a row, which is not in DF1, we
 get only the first instance of this row in the difference.

  #MAKE SOME DATA
  cars$id <- paste(cars$speed, cars$dist, sep="") #create unique ID field by pasting all columns together
  cars1 <- cars[1:35, ]
  cars2 <- cars[16:50, ]

  #EXTRACT UNIQUE ROWS
  cars1_unique <- cars1[cars1$id %in% setdiff(cars1$id, cars2$id), ] #rows unique to cars1 (i.e., not in cars2)
  cars2_unique <- cars2[cars2$id %in% setdiff(cars2$id, cars1$id), ] #rows unique to cars2

  cars1_set <- unique(cars1)
  cars2_set <- unique(cars2)

  cars1_plus <- unique(rbind(cars1_set, cars2_set))
  cars2_plus <- unique(rbind(cars2_set, cars1_set))

  cars1_diff <- cars2_plus[ - seq(nrow(cars2_set)), ]
  cars2_diff <- cars1_plus[ - seq(nrow(cars1_set)), ]

  all(cars1_unique == cars1_diff) # [1] TRUE
  all(cars2_unique == cars2_diff) # [1] TRUE

 Petr Savicky.



[R] Compare two dataframes

2010-12-16 Thread Mark Na
Hello,

I have two dataframes DF1 and DF2 that should be identical but are not
(DF1 has some rows that aren't in DF2, and vice versa). I would like
to produce a new dataframe DF3 containing rows in DF1 that aren't in
DF2 (and similarly DF4 would contain rows in DF2 that aren't in DF1).

I have a solution for this problem (see self contained example below)
but it's awkward and requires making a new ID column by pasting
together all of the columns in each DF and then comparing the two DFs
based on this unique ID.

Is there a better way?

Many thanks for your help,

Mark



#compare two dataframes and extract uncommon rows

#MAKE SOME DATA
cars$id<-paste(cars$speed,cars$dist,sep="") #create unique ID field by pasting all columns together
cars1<-cars[1:35,]
cars2<-cars[16:50,]

#EXTRACT UNIQUE ROWS
cars1_unique<-cars1[cars1$id %in% setdiff(cars1$id,cars2$id),] #rows unique to cars1 (i.e., not in cars2)
cars2_unique<-cars2[cars2$id %in% setdiff(cars2$id,cars1$id),] #rows unique to cars2



[R] How to left or right truncate a character string?

2010-12-14 Thread Mark Na
Hi R-helpers,

I have a character string, for example:

"lm(y ~ X2 + X3 + X4)"

from which I would like to strip off the leading and trailing
quotation marks resulting in this:

lm(y ~ X2 + X3 + X4)


I have tried using gsub() but I can't figure out how to specify the
quotation mark using a regular expression.

Alternatively, I would like a function that lets me delete the leading
(or trailing) X characters, and in this case X=1 (but it could be used
more flexibly to delete several leading or trailing characters).

I would appreciate help with either of these potential solutions (gsub
and regex, or delete leading/trailing characters).

Many thanks!

Mark
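
A sketch of both routes, assuming the quotation marks really are characters inside the string (the helper `trim_chars` is a made-up name, not an existing function):

```r
s <- "\"lm(y ~ X2 + X3 + X4)\""   # a string whose first and last characters are quotes

# Regex route: remove a double quote at the start (^) or end ($) of the string
gsub("^\"|\"$", "", s)

# Positional route: drop n leading and m trailing characters (hypothetical helper)
trim_chars <- function(x, n = 0, m = 0) substr(x, n + 1, nchar(x) - m)
trim_chars(s, n = 1, m = 1)
```

If the quotation marks are only how print() displays a character string, they are not part of the string itself, and noquote(s) or cat(s) shows it without them.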



[R] How to bind models into a list of models?

2010-12-14 Thread Mark Na
Hi R-helpers,

I have a character object called dd that has 32 elements each of which
is a model formula contained within quotation marks. Here's what it
looks like:

> dd
 [1] "lm(y ~ 1,data=Cement)"  "lm(y ~ X,data=Cement)"  "lm(y ~ X1,data=Cement)"
 [4] "lm(y ~ X2,data=Cement)"  "lm(y ~ X3,data=Cement)"  "lm(y ~ X4,data=Cement)"
 [7] "lm(y ~ X + X1,data=Cement)"  "lm(y ~ X + X2,data=Cement)"  "lm(y ~ X + X3,data=Cement)"
[10] "lm(y ~ X + X4,data=Cement)"  "lm(y ~ X1 + X2,data=Cement)"  "lm(y ~ X1 + X3,data=Cement)"
[13] "lm(y ~ X1 + X4,data=Cement)"  "lm(y ~ X2 + X3,data=Cement)"  "lm(y ~ X2 + X4,data=Cement)"
[16] "lm(y ~ X3 + X4,data=Cement)"  "lm(y ~ X + X1 + X2,data=Cement)"  "lm(y ~ X + X1 + X3,data=Cement)"
[19] "lm(y ~ X + X1 + X4,data=Cement)"  "lm(y ~ X + X2 + X3,data=Cement)"  "lm(y ~ X + X2 + X4,data=Cement)"
[22] "lm(y ~ X + X3 + X4,data=Cement)"  "lm(y ~ X1 + X2 + X3,data=Cement)"  "lm(y ~ X1 + X2 + X4,data=Cement)"
[25] "lm(y ~ X1 + X3 + X4,data=Cement)"  "lm(y ~ X2 + X3 + X4,data=Cement)"  "lm(y ~ X + X1 + X2 + X3,data=Cement)"
[28] "lm(y ~ X + X1 + X2 + X4,data=Cement)"  "lm(y ~ X + X1 + X3 + X4,data=Cement)"  "lm(y ~ X + X2 + X3 + X4,data=Cement)"
[31] "lm(y ~ X1 + X2 + X3 + X4,data=Cement)"  "lm(y ~ X + X1 + X2 + X3 + X4,data=Cement)"

I would like to convert this object into a list called Cand.models
with 32 list elements each of which would contain one of the above
model formulae. When I print the list, the models should run, so the
first few elements of the list would look like this (see below output
from a list I created by hand).

Many thanks for any help you can provide!

Mark




> Cand.models

[[1]]

Call:
lm(formula = y ~ 1, data = Cement)

Coefficients:
(Intercept)
      95.42


[[2]]

Call:
lm(formula = y ~ X, data = Cement)

Coefficients:
(Intercept)            X
     82.308        1.874


[[3]]

Call:
lm(formula = y ~ X1, data = Cement)

Coefficients:
(Intercept)           X1
     81.479        1.869



Re: [R] How to bind models into a list of models?

2010-12-14 Thread Mark Na
Many thanks Phil. This is perfect. I usually forget about lapply and
try something more complicated. Your solution works really well.

Best, Mark


On Tue, Dec 14, 2010 at 3:45 PM, Phil Spector spec...@stat.berkeley.edu wrote:
 Mark -
   I believe

  lapply(dd,function(m)eval(parse(text=m)))

 will do what you want.
                                        - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spec...@stat.berkeley.edu
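
A self-contained sketch of the lapply/eval/parse idea above. Since the Cement data is not part of base R, it assumes the built-in `mtcars` data and a shortened, made-up version of `dd`:

```r
# Hypothetical stand-in for `dd`, using the built-in mtcars data instead of Cement
dd <- c("lm(mpg ~ 1, data = mtcars)",
        "lm(mpg ~ wt, data = mtcars)",
        "lm(mpg ~ wt + hp, data = mtcars)")

# parse() turns each string into an expression and eval() runs it
Cand.models <- lapply(dd, function(m) eval(parse(text = m)))

# One fitted model per string; count the coefficients in each
sapply(Cand.models, function(fit) length(coef(fit)))
```

Printing `Cand.models` then shows each fitted model in turn, in the same order as the strings.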




[R] How to plot effect of x1 while controlling for x2

2010-11-15 Thread Mark Na
Hello R-helpers,

Please see a self-contained example below, in which I attempt to plot
the effect of x1 on y, while controlling for x2.

Is there a function that does the same thing, without having to
specify that x2 should be held at its mean value? It works fine for
this simple example, but might be cumbersome if the model was more
complex (e.g., lots of x variables, and/or interactions).

Many thanks,

Mark


#make some random data
x1<-rnorm(100)
x2<-rnorm(100,2,1)
y<-0.75*x1+0.35*x2

#fit a model
model1<-lm(y~x1+x2)

#predict the effect of x1 on y, while controlling for x2
xv1<-seq(min(x1),max(x1),0.1)
yhat_x1<-predict(model1,list(x1=xv1,x2=rep(mean(x2),length(xv1))),type="response")

#plot the predicted values
plot(y~x1,xlim=c(min(x1),max(x1)), ylim=c(min(y),max(y)))
lines(xv1,yhat_x1)
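
One base-R candidate for this, offered as a sketch rather than a definitive answer, is termplot() from the stats package. It plots each term's fitted contribution against its predictor, so no value has to be supplied by hand for the other covariates. The data frame `d` and the added noise are assumptions made for the illustration:

```r
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100, 2, 1))
d$y <- 0.75 * d$x1 + 0.35 * d$x2 + rnorm(100, sd = 0.5)  # noise added for this sketch

model1 <- lm(y ~ x1 + x2, data = d)

# Fitted contribution of x1, adjusted for x2; partial.resid = TRUE overlays
# partial residuals and se = TRUE adds standard-error bands
termplot(model1, terms = "x1", partial.resid = TRUE, se = TRUE)
```

termplot() works term by term in an additive model, so models with interactions would still need the predict() approach with an explicit grid of values.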



[R] Please help with min()

2010-04-17 Thread Mark Na
Hello,

I have two vectors of length = 10

x<-c(2,14,79,27,3,126,15,1,12,4)
y<-rep(4,10)

and I would like to create a third vector of length = 10 that contains the
smallest value at each position in the two above vectors.

I have tried:

z<-min(x,y)

but that doesn't work.

With the example data above, the third vector would look like this.

> z
 [1] 2 4 4 4 3 4 4 1 4 4

Any help with this would be much appreciated, thanks!

mark
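
For reference, a minimal sketch with the vectors from the post: min() collapses everything to a single number, whereas pmin() takes the minimum position by position.

```r
x <- c(2, 14, 79, 27, 3, 126, 15, 1, 12, 4)
y <- rep(4, 10)

min(x, y)         # a single value: the smallest number across both vectors
z <- pmin(x, y)   # element-wise ("parallel") minimum, same length as x and y
z
```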



[R] Conditional replacement of NA depending on value in the previous column

2010-03-23 Thread Mark Na
Dear R-helpers,

I have a dataframe like this:

ID  X1  X2  X3  X4  X5  X6
49  1   1   1   0   NA  NA
50  1   1   1   1   NA  1

I would like to convert a missing value (NA) that follows a 0 (zero) or
another missing value (NA) into a 0 (zero).

So, the above lines would be converted to:

ID  X1  X2  X3  X4  X5  X6
49  1   1   1   0   0   0
50  1   1   1   1   NA  1

I have been struggling with this all morning, so any help you could provide
would be much appreciated.

Thank you!

Mark Na
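
A sketch of one vectorised way to do this. It assumes the rule is applied to the original values, that is, an NA becomes 0 when the column immediately to its left originally held 0 or NA; the data frame below just retypes the two example rows:

```r
df <- data.frame(ID = c(49, 50),
                 X1 = c(1, 1), X2 = c(1, 1), X3 = c(1, 1),
                 X4 = c(0, 1), X5 = c(NA, NA), X6 = c(NA, 1))

m    <- as.matrix(df[, -1])                 # the X columns only
prev <- m[, -ncol(m), drop = FALSE]         # value in the column to the left
cur  <- m[, -1, drop = FALSE]               # value being inspected (X2..X6)

# An NA is replaced when its left neighbour (in the original data) is 0 or NA
hit <- is.na(cur) & (is.na(prev) | prev == 0)
cur[hit] <- 0

df[, -(1:2)] <- cur                         # write X2..X6 back
df
```

Under this reading a run such as 1, NA, NA would become 1, NA, 0; if that is not wanted, the rule would need to be applied column by column from left to right instead.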



[R] Please help with loop, thanks

2010-03-18 Thread Mark Na
Dear R helpers,

I would like to write a loop that makes 4 objects (called A, B, C, and D)
each of which contains ten random numbers.

This attempt:

individuals<-c("A","B","C","D")
for(i in 1:length(individuals)) {
individuals[i]<-rnorm(10)
}


does not work because individuals[i] is not the proper way to extract each
letter from the object called individuals (rather, it tries to assign the
random numbers to various positions within individuals)

So, my question is, what should be to the left of the gets operator in the
third line?

Many thanks,

Mark Na



Re: [R] Please help with loop, thanks

2010-03-18 Thread Mark Na
Many thanks to everyone who helped me solve this problem.

I think I must have described my problem poorly, but Phil, Patrick and Jim
were able to see through the haze and suggest that I use a list to contain
the output from my loop. This solution works very well.

Thanks again for your help, Mark
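
For the archive, a minimal sketch of the list approach described above (the name `draws` is made up here; the replies themselves are not reproduced in this thread):

```r
individuals <- c("A", "B", "C", "D")

# A named list holds one vector of random numbers per individual
draws <- list()
for (i in individuals) {
  draws[[i]] <- rnorm(10)
}

names(draws)
draws$A
```

Separate objects called A to D could also be created with assign(i, rnorm(10)) inside the loop, but a list is usually easier to work with afterwards.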





-- 
Mark Na
University of Saskatchewan
Saskatoon, Canada



[R] Please help with a basic function

2009-12-11 Thread Mark Na
Hello,

I am learning how to use functions, but I'm running into a roadblock.

I would like my function to do two things: 1) convert an object to a
dataframe, 2) and then subset the dataframe. Both of these commands work
fine outside the function, but I would like to wrap them in a function so I
can apply the code iteratively to many such objects.

Here's what I wrote, but it doesn't work:

convert<-function(d) {
 d<-data.frame(d); #convert object to dataframe
 d<-subset(d,select=c(time,coords.x1,coords.x2)) #select some columns
}
convert(data) #the problem is that data is the same as it was before running the function

The objects being processed through my function are SpatialPointsDataFrames
but I'm quite sure that's not my problem, as I can process these outside of
the function (using the above code) ... it's when I try to wrap the code in
a function that it doesn't work.

Thanks, Mark



Re: [R] Please help with a basic function

2009-12-11 Thread Mark Na
Many thanks for the replies to my call for help this morning. I didn't know
about return() and that helped quite a bit.

Best, Mark
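
A small sketch of the point about return values, using a matrix built from the built-in `cars` data in place of a SpatialPointsDataFrame (an assumption made to keep the example free of extra packages). A function works on a copy of its argument, so the result has to be returned and assigned by the caller:

```r
convert <- function(d) {
  d <- data.frame(d)                   # works on a local copy of the argument
  d <- subset(d, select = c(speed))    # keep some columns
  d                                    # the value of the function
}

dat <- as.matrix(cars)     # built-in data, standing in for the real object
result <- convert(dat)     # the caller must capture the returned value

class(dat)                 # unchanged by the call
class(result)
```

Calling convert(dat) without assigning the result leaves `dat` exactly as it was, which matches the behaviour described in the original question.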

On Fri, Dec 11, 2009 at 10:00 AM, Paul Hiemstra p.hiems...@geo.uu.nl wrote:

 Hi Mark,

 This question would probably be better suited for the r-sig-geo mailing
 list. In addition, please read the posting guide and provide a piece of code
 that reproduces the problem.

 library(sp)


 convert <- function(d) {
   d <- data.frame(d)  # convert object to dataframe
   d <- subset(d, select = c(zinc, x, y))  # select some columns
   d  # <- add this, or alternatively 'return(d)'
 }

 data(meuse)
 coordinates(meuse) = ~x+y

 convert(meuse)

 But maybe better, subsetting a SPDF can be done using:

 meuse["zinc"] # Remains an SPDF
 # Returns a data.frame
 data.frame(coordinates(meuse), zinc = meuse$zinc)

 And some unrequested advice :). To process multiple files, take a look at
 lapply, both for reading and processing.

 all_data = lapply(list_of_files, function(file) {
   bla = read.table(file)
   coordinates(bla) = ~coor.x1 + coor.x2
   return(bla)
 })
 # all_data is now a list with the SPDFs

 processed_data = lapply(all_data, function(dat) {
   return(data.frame(coordinates(dat), zinc = dat$zinc))
 })

 Of course, you can include the latter lapply stuff inside the first 'loading'
 lapply.

 all_data = lapply(list_of_files, function(file) {
   bla = read.table(file)
   bla = subset(bla, select = c(time, coords.x1, coords.x2))
   coordinates(bla) = ~coords.x1 + coords.x2
   return(bla)
 })

 hope this helps and good luck,

 Paul





 --
 Drs. Paul Hiemstra
 Department of Physical Geography
 Faculty of Geosciences
 University of Utrecht
 Heidelberglaan 2
 P.O. Box 80.115
 3508 TC Utrecht
 Phone:  +3130 274 3113 Mon-Tue
 Phone:  +3130 253 5773 Wed-Fri
 http://intamap.geo.uu.nl/~paul




-- 
Mark Na
University of Saskatchewan
Saskatoon, Canada



[R] How to apply five lines of code to ten dataframes?

2009-12-07 Thread Mark Na
Hello R-helpers,

I have 10 dataframes (named data1, data2, ... data10) and I would like to
add 5 new columns to each dataframe using the following code:

data1$LogDepth <- log10(data1[,2]/data1[,4])
data1$LogArea <- log10(data1[,3]/data1[,5])
data1$p <- 2*data1[,6]/data1[,7]
data1$Exp <- data1[,2]^(2/data1[,8])
data1$s <- data1[,3]/data1[,9]

...but I would prefer not to repeat this chunk of code 10 times!

I have struggled with setting up a loop to apply these 5 lines of code to
each of the 10 dataframes, but I'm not having much luck.

Any help would be much appreciated.

Thank you, Mark
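
One possible approach, sketched under the assumption that the data frames share the same column layout: put the five lines in a function and apply it to a list of the data frames with lapply(). The helper names `make_df` and `add_columns` and the random example data are made up for illustration, and only three data frames are used to keep the sketch short.

```r
# Hypothetical example data: three small data frames with nine numeric columns.
make_df <- function() as.data.frame(matrix(runif(45, 1, 10), nrow = 5))
data1 <- make_df(); data2 <- make_df(); data3 <- make_df()

add_columns <- function(d) {
  # Compute all new columns from the original columns, then attach them.
  new <- data.frame(
    LogDepth = log10(d[, 2] / d[, 4]),
    LogArea  = log10(d[, 3] / d[, 5]),
    p        = 2 * d[, 6] / d[, 7],
    Exp      = d[, 2]^(2 / d[, 8]),
    s        = d[, 3] / d[, 9]
  )
  cbind(d, new)
}

datalist <- mget(paste0("data", 1:3))      # named list: data1, data2, data3
datalist <- lapply(datalist, add_columns)  # same function applied to each element
```

With ten data frames, `paste0("data", 1:10)` collects them all; the results stay together in `datalist`, which is usually easier to work with than ten separate objects.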



[R] Very basic R workflow question for Windows users

2009-12-07 Thread Mark Na
Hi all,

If you use the R Editor (not another text editor), please read on...

So, my usual R workflow involves having two windows open (R Console, R
Editor), writing a line of code in the R Editor and then typing Ctrl-R to
run that line. Then, quite frequently, I want to run a new command to check
my output (e.g., dim(dataframe) or str(dataframe) or head(dataframe)) but I
*do not want* to write that command in the R Editor, rather I want to edit
it directly in the R Console.

My (basic) question is: how do you use the keyboard (not mouse) to move the
cursor from the Editor to the Console? The only (cumbersome) way I know is
to type Alt-Tab and then Tab-Tab-Tab to get to the Console.

Is there a better way?

Thanks, Mark



[R] Convert a list of N dataframes to N dataframes

2009-12-07 Thread Mark Na
Hello,

I have used the following command:

datalist <- list(data1,data2,data3,data4,data5,data6)

to make a list of my dataframes, which I then manipulated with these
commands:

datalist <- llply(datalist,LogDepth)
datalist <- llply(datalist,LogArea)
datalist <- llply(datalist,p)
datalist <- llply(datalist,Exp)
datalist <- llply(datalist,s)

This worked very nicely (thanks for plyr, Hadley) but now I would like to
unlist my list into the individual dataframes, preferably with their
original names (data1, etc).

I've tried to do this with:

 ldply(datalist,unlist)

but that's not working. Any help with this would be much appreciated.

Thanks, Mark
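
A minimal sketch of one way to get the list elements back as separate objects, assuming the list is first given names (the three tiny data frames here are made up). The base function list2env() copies each named element into an environment under its name:

```r
# Hypothetical stand-ins for the processed data frames.
datalist <- list(data.frame(x = 1:2), data.frame(x = 3:4), data.frame(x = 5:6))

# Name the elements after the objects they should become...
names(datalist) <- paste0("data", seq_along(datalist))

# ...then copy each element into the workspace under its name.
invisible(list2env(datalist, envir = .GlobalEnv))
data2
```

This overwrites any existing objects called data1, data2, and so on, so it is worth checking the names before running it.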



[R] How to identify the rows in my dataframe with a negative value in any column?

2009-11-24 Thread Mark Na
Dear R-helpers,

I have a dataframe that should not contain any negative values, but it does.
I wish to print the rows from my dataframe that contain a negative value in
any column. I've tried this:

 dataframe[dataframe<0,]

but it just returns a row of NAs.

I would very much appreciate any help with this you could provide.

Thanks, Mark



Re: [R] How to identify the rows in my dataframe with a negative value in any column?

2009-11-24 Thread Mark Na
Thanks Steve, this works very well!

Mark
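
For comparison with the apply() answer quoted below, a vectorised sketch of the same check, assuming all columns are numeric. It reuses the example data frame from that answer; `na.rm = TRUE` is an added assumption so that missing values do not turn the test into NA.

```r
df <- data.frame(a = 1:10, b = c(1:3, -4, 5:10), c = c(-1, 2:10))

# df < 0 is a logical matrix; a row with at least one TRUE contains a negative value.
has.neg <- rowSums(df < 0, na.rm = TRUE) > 0

df[has.neg, ]   # the offending rows themselves
```

Indexing with `has.neg` (one logical value per row) is what distinguishes this from the original attempt, which indexed the rows with a whole logical matrix.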



On Tue, Nov 24, 2009 at 2:07 PM, Steve Lianoglou
<mailinglist.honey...@gmail.com> wrote:

 Hi,


 Imagine you had a data.frame like this:

 R> df <- data.frame(a=1:10, b=c(1:3,-4, 5:10), c=c(-1, 2:10))

 This will return you a boolean vector of which rows have negative values:

 R> has.neg <- apply(df, 1, function(row) any(row < 0))

 If you want the actual index numbers:
 R> which(has.neg)
 [1] 1 4

 HTH,
 -steve

 --
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
 Contact Info: 
 http://cbio.mskcc.org/~lianos/contact




-- 
Mark Na
University of Saskatchewan
Saskatoon, Canada



Re: [R] Change positions of columns in data frame

2009-10-23 Thread Mark Na
Hi Joel,
The answers you've received already, suggesting subscripting, are good
because they strengthen your understanding of R subscripting. However,
sometimes these methods produce strange column names. So, what I usually
do is use the subset command. You don't have to provide anything for the
subset argument (i.e., you'll keep all the rows) but you can re-order the
columns by listing them in the order you want, within the select argument,
like this:
DF <- subset(DF, select=c(var3,var2,var1))

HTH, Mark
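
A small sketch of the subset() approach next to plain indexing by name, assuming a made-up three-column data frame. Both forms keep the column names intact:

```r
# Hypothetical data frame; the column names are illustrative.
DF <- data.frame(var1 = 1:3, var2 = letters[1:3], var3 = c(2.5, 3.5, 4.5))

by_subset <- subset(DF, select = c(var3, var2, var1))  # unquoted names
by_name   <- DF[, c("var3", "var2", "var1")]           # character vector of names
names(by_subset)
```

The subset() form is convenient interactively; indexing with a character vector is the safer choice inside functions, because it does not rely on evaluating the unquoted names.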



2009/10/23 Joel Fürstenberg-Hägg <joel_furstenberg_h...@hotmail.com>


 Hi all,

 Probably a simple question, but I just can't find a simple answer in the
 older threads or anywhere else.

 I've added some new vectors as columns in a data frame using cbind(). As
 they're all put as the last columns in the data frame, I would like to
 move them to specific positions. How do you change the position of a
 column in a data frame?

 I know I can use
 fieldTrial0809=data.frame(Sample_ID=as.factor(fieldTrial0809$Sample_ID),
 Plant_ID=as.factor(fieldTrial0809$Plant_ID), ...) to create a new data frame
 with the given columns in the specified order, but there must be an easier
 way?

 All the best,

 Joel





-- 
Mark Na
University of Saskatchewan
Saskatoon, Canada



[R] List of Windows time zones?

2009-10-20 Thread Mark Na
Hello,

I would like to adapt the following code:

 data$datetime3 <- format(data$datetime2, tz="EST")

for different time zones. For example, North America's CST and MDT (but
those codes don't work).

I have read ?Sys.timezone but I'm afraid it isn't very helpful. I'm just
looking for a list of the timezones, not a history of how time zones have
been dealt with throughout R's history. Previously asked messages (and
replies) on R-help provide confusing and sometimes contradictory advice.

Is there a simple list of timezones, for R on Windows? Or, can they be
calculated on the fly? (For example, "GMT+6" does not work in the above code.)

Any help would be much appreciated as I am getting a bit frustrated, thanks!

Mark Na
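
A hedged sketch of how this can be done in current versions of R, assuming the installation ships the Olson/IANA time zone database. OlsonNames() was added to base R in version 3.1.0, well after this thread, so it was not available at the time of posting. The timestamp is made up for illustration.

```r
# A fixed instant, stored in UTC.
x <- as.POSIXct("2009-10-20 18:00:00", tz = "UTC")

head(OlsonNames())   # time zone names known to this R installation

# Region-based names follow daylight saving rules automatically.
central  <- format(x, format = "%Y-%m-%d %H:%M", tz = "America/Chicago")
mountain <- format(x, format = "%Y-%m-%d %H:%M", tz = "America/Denver")

# Fixed offsets use the POSIX sign convention: "Etc/GMT+6" is 6 hours BEHIND UTC.
fixed <- format(x, format = "%Y-%m-%d %H:%M", tz = "Etc/GMT+6")
```

Abbreviations such as CST or MDT are ambiguous across regions, which is one reason the region-based names are generally preferred.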



[R] Make a blank dataframe with given dimensions

2009-10-07 Thread Mark Na
Hi R-helpers,

I would like to make a blank dataframe, i.e. a dataframe without any rows.

I would like the blank dataframe (which is to be called merged) to have 0
rows and 32 columns. Once I've made the dataframe, I'll specify the column
names using:


names(merged) <- c("GRIDCODE",paste("VALUE",0:3,sep="_"),paste("VALUE",5:30,sep="_"),"AREA")

Then I'll add rows to it, using the loop (which is working fine).

Thanks for any help!

Mark
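
A minimal sketch of one way to build such an empty data frame, assuming it is acceptable for the columns to start out as logical (they are coerced when rows are added). The variable `col_names` is introduced here only for illustration:

```r
col_names <- c("GRIDCODE", paste("VALUE", 0:3, sep = "_"),
               paste("VALUE", 5:30, sep = "_"), "AREA")

# A 0 x 32 matrix converted to a data frame gives 0 rows and 32 columns.
merged <- data.frame(matrix(nrow = 0, ncol = length(col_names)))
names(merged) <- col_names
dim(merged)
```

If the loop adds many rows, collecting the pieces in a list and combining them once with `do.call(rbind, pieces)` is usually faster than growing `merged` row by row.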



[R] How to rename columns that start with numbers?

2009-08-13 Thread Mark Na
Hello,

My dataframe has new columns that start with the number 1 or 2 (resulting
from a reshape cast command).

Instead of having these columns automatically renamed by R to start with the
letter X, I would like to rename these columns to start with the characters
SURV_ (e.g., SURV_1, SURV_2).

I can't seem to use grep() to identify and rename the columns starting with
either 1 or 2.

Any help would be much appreciated, thanks!

(I know I could rename these manually, but the above is a simpler statement
of the actual problem, which involves several dozen columns, so that's why
I'd prefer not to do it manually)

Mark Na
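
A minimal sketch of a pattern-based rename, assuming the column names really do begin with the digits 1 or 2 (the data frame below is made up, and `check.names = FALSE` is used so that data.frame() does not add the X prefix itself):

```r
# Hypothetical cast() result with names beginning with digits.
df <- data.frame(ID = 1:2, `1` = c(5, 6), `2` = c(7, 8), check.names = FALSE)

starts_with_digit <- grepl("^[12]", names(df))   # TRUE for "1" and "2"
names(df)[starts_with_digit] <-
  paste("SURV", names(df)[starts_with_digit], sep = "_")
names(df)
```

If R has already renamed the columns to X1, X2 and so on, `sub("^X([12])", "SURV_\\1", names(df))` performs the equivalent replacement on the prefixed names.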



[R] Compare lm() to glm(family=poisson)

2009-07-31 Thread Mark Na
Dear R-helpers,
I would like to compare the fit of two models, one of which I fit using lm()
and the other using glm(family=poisson). The latter doesn't provide
r-squared, so I wonder how to go about comparing these
models (they have the same formula).

Thanks very much,

Mark Na



[R] Testing year effects in lm()

2009-07-30 Thread Mark Na
Dear R-helpers,

I have a linear model with a year effect (year is coded as a factor), i.e.
the parameter estimates for each level of my year variable have significant
P values (see some output below) and I am interested in testing:

a) the overall effect of year;
b) the significance of each year vis-a-vis every other year (the model
output only tests each year against the baseline year).

I'd appreciate any help with how to perform these post-hoc tests in R.

Many thanks,

Mark Na




Call:
lm(formula = data$SR.obs ~ log(data$AREA, 10) + data$YEAR,
    subset = (data$AREA = 14.5))

Residuals:
    Min      1Q  Median      3Q     Max
-5.3412 -1.3140  0.1108  1.1972  4.3126

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)
(Intercept)         -9.4606     0.6144 -15.399  < 2e-16 ***
log(data$AREA, 10)   3.9261     0.1734  22.644  < 2e-16 ***
data$YEAR2008        1.0750     0.2854   3.767 0.000211 ***
data$YEAR2009        1.5884     0.3073   5.169 5.18e-07 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.822 on 226 degrees of freedom
Multiple R-squared: 0.6945, Adjusted R-squared: 0.6905
F-statistic: 171.3 on 3 and 226 DF,  p-value: < 2.2e-16

[1] AIC=  934.557
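
A hedged sketch of one way to approach both tests with base R, using simulated data rather than the poster's (all values below are made up, and 2007 is only assumed to be the baseline year). drop1() gives an F test for removing the whole YEAR term; refitting with a different reference level via relevel() gives the remaining pairwise comparison.

```r
set.seed(1)
# Simulated stand-in for the poster's data.
d <- data.frame(AREA = runif(90, 15, 500),
                YEAR = factor(rep(c(2007, 2008, 2009), each = 30)))
d$SR.obs <- -9 + 4 * log10(d$AREA) + c(0, 1, 1.5)[as.integer(d$YEAR)] + rnorm(90)

fit <- lm(SR.obs ~ log10(AREA) + YEAR, data = d)

# (a) Overall effect of YEAR: F test for dropping the whole term.
overall <- drop1(fit, test = "F")
overall

# (b) 2008 vs 2009: refit with 2008 as the baseline level.
d$YEAR08 <- relevel(d$YEAR, ref = "2008")
fit08 <- lm(SR.obs ~ log10(AREA) + YEAR08, data = d)
summary(fit08)$coefficients
```

The p-values from the refit are not adjusted for multiple comparisons; add-on packages such as multcomp provide adjusted pairwise tests if that matters for the analysis.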




[R] Make my plots bigger and reduce white space around panels?

2009-07-28 Thread Mark Na
Hi,

I have made a plot with panels (attached) using R code (below) and I'd like
to increase the size of each panel and decrease the white space, especially
the white space between:

1. rows of panels
2. the top panel and its title (which contains info on r2 and N)
3. each panel and its x label.

I've dug around in the plot help files but can't seem to find how to do
this.

Any help much appreciated, thanks!

Mark Na




#ELEPHANT SPECIES RICHNESS
par(mfrow=c(3,4),oma=c(0,0,2,0))
models <- list(data$SR.elephant.obs~data$AREA,
               log(data$SR.elephant.obs+1,10)~log(data$AREA,10),
               data$SR.elephant.obs~log(data$AREA,10),
               log(data$SR.elephant.obs+1,10)~data$AREA)
for (i in 1:length(models)){ #SCATTERPLOT
  model <- lm(models[[i]])
  plot(models[[i]],ylab="Elephant SR")
  abline(model)
  title(main=paste("r2=",round(summary(model)$r.squared,digits=3),", N=",dim(data)[1]))
}
for (i in 1:length(models)){ #RESIDUALS VS FITTED VALUES PLOT
  model <- lm(models[[i]])
  plot.lm(model,which=1,sub.caption=NA)
}
for (i in 1:length(models)){ #Q-Q PLOT
  model <- lm(models[[i]])
  plot.lm(model,which=2,sub.caption=NA)
}
title(main="ELEPHANT SPECIES RICHNESS",outer=TRUE)
savePlot("SR_elephant.emf",type="emf"); dev.off()
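
A minimal sketch of the graphical parameters that control this white space, assuming base graphics and made-up data. The `mar` setting shrinks the margins around each panel, `mgp` moves axis titles and labels closer to the axes, and the `line` argument of title() pulls each panel title toward its plot. The specific numbers are starting points to adjust, not recommended values.

```r
pdf(NULL)                            # throw-away device so the sketch runs anywhere
old <- par(mfrow = c(3, 4),
           mar = c(3, 3, 1.5, 0.5),  # bottom, left, top, right margins (in lines)
           mgp = c(1.8, 0.6, 0),     # positions of axis title, labels, axis line
           oma = c(0, 0, 2, 0))      # outer margin kept for the overall title
for (i in 1:12) {
  plot(rnorm(20), rnorm(20), xlab = "x", ylab = "y")
  title(main = paste("panel", i), line = 0.4, cex.main = 0.9)
}
title(main = "Overall title", outer = TRUE)
par(old)                             # restore the previous settings
invisible(dev.off())
```

Opening a larger device before plotting also makes every panel bigger without changing any of the margins.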



Re: [R] Make my plots bigger and reduce white space around panels?

2009-07-28 Thread Mark Na
The plot is attached this time...





-- 
Mark Na
University of Saskatchewan
Saskatoon, Canada


[R] Reordering the columns of my dataframe

2009-07-27 Thread Mark Na
Hi R-helpers,

I have written this line of code:

 data <- cbind(data[,1],data[,2:6],data[,18],data[,7:17])

to reorder the columns of my dataframe, but I'm losing the column names of
my 1st and 18th columns (they are now named data[,1] and data[,18]
respectively).

Can I use cbind to do this (without losing my column names) or is there
another way?

Many thanks,

Mark Na
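
A minimal sketch of an alternative that avoids cbind() altogether, assuming an 18-column data frame (the one below is made up, with default names V1 to V18). Indexing once with the full column order keeps every name:

```r
# Hypothetical 18-column data frame named V1..V18.
data <- as.data.frame(matrix(1:36, nrow = 2, ncol = 18))

# Index once with the desired column order; names are carried along.
data <- data[, c(1:6, 18, 7:17)]
names(data)
```

The names are lost in the cbind() version because `data[,1]` and `data[,18]` drop to plain vectors; `data[, 1, drop = FALSE]` would keep them as one-column data frames.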



[R] How to perform a calculation in each element of my list?

2009-07-23 Thread Mark Na
Hi R-helpers,

I have a list containing 10 elements, each of which is a dataframe. I wish
to add a new column to each list element (dataframe) containing the product
of the last two columns of each dataframe.

I'd appreciate any pointers, thanks!

Mark Na
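
A minimal sketch using lapply(), assuming the last two columns of every data frame are numeric. The list `dflist`, the helper `add_product` and the new column name `product` are made up for illustration:

```r
# Hypothetical list of two small data frames.
dflist <- list(data.frame(id = 1:3, a = c(2, 3, 4), b = c(10, 20, 30)),
               data.frame(id = 1:2, a = c(5, 6),    b = c(0.5, 2)))

add_product <- function(d) {
  n <- ncol(d)                       # position of the last column
  d$product <- d[[n - 1]] * d[[n]]   # product of the last two columns
  d                                  # return the modified data frame
}

dflist <- lapply(dflist, add_product)
```

Because the column positions are computed inside the function, the data frames may have different numbers of columns.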



[R] How to extract the upper xlim and ylim of my plot?

2009-07-21 Thread Mark Na
Dear R-helpers,

I wish to place some text in a plot, at approx 10% of my upper xlim and
approx 90% of my upper ylim, i.e.

 plot(log(all$SR,10)~log(all$AREA,10))
 text(.1*max(xlim),.9*max(ylim),"text to be placed")

(I know how to give absolute coordinates for text location, but I wish to
use relative coordinates).

My code (above) doesn't work because I don't know how to properly extract
the upper xlim and ylim values.

Does anyone know how I could extract the upper xlim and ylim values without
using max(x-variable) or max(y-variable)? I wish to keep this as general
as possible and not point to the original data.

Thanks in advance,

Mark
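
A minimal sketch of relative text placement, assuming base graphics and a made-up plot. After a plot is drawn, `par("usr")` holds the extremes of the plotting region as c(x1, x2, y1, y2), so positions can be computed as fractions of the axis ranges rather than of the data:

```r
pdf(NULL)                                  # throw-away device for the sketch
plot(1:10, (1:10)^2)

usr <- par("usr")                          # c(x1, x2, y1, y2) of the plot region
x <- usr[1] + 0.1 * (usr[2] - usr[1])      # 10% of the way along the x axis
y <- usr[3] + 0.9 * (usr[4] - usr[3])      # 90% of the way up the y axis
text(x, y, "text to be placed", adj = 0)   # adj = 0 left-aligns the text at (x, y)
invisible(dev.off())
```

Measuring from the lower limits rather than multiplying the upper limits by 0.1 and 0.9 matters whenever an axis does not start at zero, as with log-transformed data.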



[R] Averaging dataframes that are stored in a list

2009-07-14 Thread Mark Na
Dear R-helpers,
I have a list containing 5000 elements, each element is a dataframe
containing one ID column (identical over the 5000 dataframes) and 9 numeric
variables, e.g.

ID VAR1 VAR2 VAR3 ... VAR9

I would like to create a new dataframe containing the ID column and the mean
values of the 9 numeric variables. So, the structure of this new dataframe
would be identical to the structure of the dataframes stored in my list (and
the ID column would also be identical) but the values would be mean values.

I've been attempting to do this with rowMeans and subscripting the list
using double square brackets, but I can't get it to work.

I'd appreciate any pointers to get me going, thanks!

Mark Na
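
A minimal sketch of an element-wise mean over the list, assuming every data frame has the same rows in the same order and no missing values. The three small data frames are made up for illustration:

```r
# Hypothetical list of three data frames sharing an ID column.
lst <- lapply(1:3, function(k)
  data.frame(ID = c("a", "b"), VAR1 = c(1, 2) * k, VAR2 = c(10, 20) * k))

num <- lapply(lst, function(d) as.matrix(d[-1]))  # drop ID, keep the numeric part
avg <- Reduce(`+`, num) / length(num)             # element-wise mean over the list
result <- data.frame(ID = lst[[1]]$ID, avg)
result
```

Reduce() adds the matrices one at a time, so only two of the 5000 are combined at any moment; a single NA in any data frame would propagate to the corresponding mean.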



Re: [R] How to combine two rows (in a dataframe) into a third row?

2009-07-13 Thread Mark Na
Hi Henrique & other R-helpers,
Thank you for helping me last week. I used Henrique's suggestion to develop
some code (below) to combine two rows in my dataframe into a third row, and
then delete the original two rows. It works well.

My solution is not very elegant however; if there's a function (or a better
way) to accomplish this in 1-2 lines (rather than my 6) I'd appreciate
knowing about it.

Many thanks, Mark Na


#make some data for this example
data <- data.frame(c("1A","1B"),c(10,15))
names(data) <- c("id","value")
data$value <- as.numeric(as.character(data$value))

#combine two lines into one by summing their values in the value column
fixed <- data.frame() #create empty data frame to hold fixed rows
fixed <- rbind(fixed, aggregate(data["value"],list(substr(data[,"id"],1,1)),sum))
#copy previous line as necessary for other fixes
names(fixed) <- c("id","value") #fix column names

#bind the fixed line to the main dataframe and delete the original lines
data <- rbind(data,fixed) #add fixed lines to data
data <- data[!(data$id %in% c("1A","1B")),] #delete original lines from data
rownames(data) <- 1:nrow(data) #renumber rows
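
A shorter sketch of the same idea, under the assumption that every id should be collapsed to its first character (which differs from the code above, where only the listed rows are combined). The example data, including the extra "2A" row, are made up:

```r
data <- data.frame(id = c("1A", "1B", "2A"), value = c(10, 15, 7))

# Strip the trailing letter from each id, then sum 'value' within each new id.
data$id <- substr(data$id, 1, 1)
combined <- aggregate(value ~ id, data = data, FUN = sum)
combined
```

Rows whose id is unique after truncation pass through unchanged, so no separate rbind() or delete step is needed.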



On Thu, Jul 9, 2009 at 5:58 PM, Henrique Dallazuanna <www...@gmail.com> wrote:

 Try this:

 aggregate(x["VALUE"], list(substr(x[,"ID"], 1, 1)), sum)





 --
 Henrique Dallazuanna
 Curitiba-Paraná-Brasil
 25° 25' 40" S 49° 16' 22" O




Re: [R] Randomizing a dataframe

2009-07-10 Thread Mark Na
Greg's reply was just what I needed to get me going. I used his advice to
produce a program which does just what I need. In case it helps someone
else, my program is below.

Mark Na


library(reshape)
data <- read.csv("data.csv")
datam <- melt(data,id=("TREE")) #value = number of individuals

datam <- datam[rep(1:nrow(datam), datam$value),] #expand rows based on number of individuals
rownames(datam) <- 1:nrow(datam) #fix rownames
datam <- subset(datam,select=c(TREE,variable)) #drop columns
names(datam) <- c("TREE","SPECIES") #rename columns

datap <- data.frame(sample(datam$TREE),datam$SPECIES) #randomly permute TREE
names(datap) <- c("TREE","SPECIES") #rename columns

datat <- data.frame(table(datap)) #collapse rows based on number of individuals = Freq
datac <- cast(datat,TREE~SPECIES,value="Freq") #the final permuted table
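
For readers without the reshape package, a hedged base R sketch of the same three steps (one row per individual, permute one column, tabulate back), using a small made-up count matrix. The last line uses r2dtable() from the stats package, which generates random two-way tables with given row and column totals and may be convenient when many randomizations are needed.

```r
set.seed(42)
# Hypothetical counts: 4 trees x 3 species.
counts <- matrix(c(2, 0, 1,
                   1, 3, 0,
                   0, 2, 2,
                   4, 1, 1), nrow = 4, byrow = TRUE,
                 dimnames = list(TREE = paste0("T", 1:4),
                                 SPECIES = paste0("SPEC", 1:3)))

# 1. One row per individual insect.
long <- as.data.frame(as.table(counts))
long <- long[rep(seq_len(nrow(long)), long$Freq), c("TREE", "SPECIES")]

# 2. Permute the tree labels; 3. tabulate back to a tree x species table.
permuted <- table(TREE = sample(long$TREE), SPECIES = long$SPECIES)

# Many random tables with the same margins in one call.
many <- r2dtable(100, rowSums(counts), colSums(counts))
```

Both `permuted` and each element of `many` have the same row and column totals as `counts`, which can be confirmed with rowSums() and colSums().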



On Wed, Jul 8, 2009 at 11:28 AM, Greg Snow <greg.s...@imail.org> wrote:

 Here is one approach (there are others, some that are probably better, but
 this can get you started):

 1. rearrange your data so that every insect is a single row with 2 columns:
 the tree id and the species (this new dataset will have as many rows as the
 sum of the values in the old dataset).  The reshape package may be able to
 help with this step (you may also need the rep function).

 2. randomly permute one of the 2 columns (see ?sample).

 3. restructure the permuted data back to the original (the table function
 may be enough here, the reshape package will give more options).

 Hope this helps,

 --
 Gregory (Greg) L. Snow Ph.D.
 Statistical Data Center
 Intermountain Healthcare
 greg.s...@imail.org
 801.408.8111






[R] How to combine two rows (in a dataframe) into a third row?

2009-07-09 Thread Mark Na
Dear R-helpers,

I have two rows in my dataframe:

ID   VALUE
1A   10
1B   15

and I would like to combine these two rows into a single (new) row in my
dataframe:

ID   VALUE
1    25

...simply by specifying a new value for ID and summing the two VALUES.

I have been trying to do this with rbind, but it's not working.

I'd appreciate any pointers.

Thanks, Mark Na



[R] Randomizing a dataframe

2009-07-08 Thread Mark Na
Hi R-helpers,

I have a dataframe (called data) with trees in rows (n=100) and insect
species (n=10) in columns. My tree IDs are in a column called TREE and each
species has a column labeled SPEC1, SPEC2, SPEC3, etc...

I wish to randomize the values in my dataframe such that row and column
totals are held constant, i.e. in my randomized data each tree will have the
same number of individual insects as in the real data (constant row totals)
and each species will have the same number of individuals as in the real
data (constant column totals).

I will eventually want to do this many times, but I would appreciate help
getting started with the randomization.

Thank you, Mark Na



[R] How to wrap my (working) code in a loop or function? (loop/function newbie alert)

2009-06-30 Thread Mark Na
Dear R-helpers,

I have split a dataframe into a list with five elements, with the following
code:

 datalist <- split(data,data$UNIT)

I would now like to run some code (below) on each element of the list to
extract rows from the list elements; then I would like to rbind the
extracted rows into a new dataframe containing all of the extracted rows
from all of the list elements.

I don't need any help with the code itself, it works fine for one chunk of
data (e.g., a single dataframe). The code is:

t0 <- match(times$START_DT, data$DATETIME) #MAKE A VECTOR OF START TIMES
t1 <- match(times$STOP_DT, data$DATETIME) #MAKE A VECTOR OF STOP TIMES
indices <- mapply(FUN = ":", t0, t1) #MAKES A LIST, EACH ELEMENT CONTAINS INDICES OF TIMES CORRESPONDING TO ONE WETLAND
idex <- times[rep(1:nrow(times), sapply(indices, length)), c("POND_ID","OBS","REP","PID"), drop = FALSE] #MAKES A DATAFRAME
tm <- data[unlist(indices), ] #FLATTENS THE LIST OF INDICES INTO A DATAFRAME
extracted <- cbind(idex, tm) #BIND IDEX AND TM

But now that I've split my data into a list with five elements, what I don't
know how to do is wrap my code in a loop or function so I can run it on each
of the five list elements and then rbind the extracted rows together into a
new dataframe.

(What I have now is 5 replicates of the above code, and I would like to
replace that with a loop or function.)

I have spent all morning on this, without much progress, so would appreciate
any help you might be able to provide.

Thanks! Mark Na
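
One possible way to do this is to wrap the working code in a function that takes one chunk, run it over the list with lapply, and rbind the pieces with do.call. The sketch below is only an illustration: the function name extract_rows, the toy data, and the step that skips intervals not found in a chunk are assumptions, not part of the original code.

```r
# Hypothetical helper: extract the rows of one chunk that fall between
# each START_DT/STOP_DT pair in `times`.
extract_rows <- function(chunk, times) {
  t0 <- match(times$START_DT, chunk$DATETIME)
  t1 <- match(times$STOP_DT, chunk$DATETIME)
  ok <- !is.na(t0) & !is.na(t1)   # assumption: skip intervals absent from this chunk
  if (!any(ok)) return(NULL)
  indices <- mapply(FUN = ":", t0[ok], t1[ok], SIMPLIFY = FALSE)
  idex <- times[rep(which(ok), sapply(indices, length)),
                c("POND_ID", "OBS", "REP", "PID"), drop = FALSE]
  cbind(idex, chunk[unlist(indices), ])
}

# Toy data standing in for the real dataframes
data <- data.frame(UNIT = rep(c("U1", "U2"), each = 5),
                   DATETIME = c(1:5, 11:15),
                   VALUE = 1:10)
times <- data.frame(START_DT = c(2, 12), STOP_DT = c(3, 14),
                    POND_ID = c("P1", "P2"), OBS = c("A", "B"),
                    REP = c(1, 1), PID = c(101, 102))

datalist <- split(data, data$UNIT)
# Run the helper on every list element, then stack the results
extracted <- do.call(rbind, lapply(datalist, extract_rows, times = times))
```

do.call(rbind, ...) silently drops NULL elements, so chunks with no matching interval simply contribute no rows.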



[R] How to select partially (not completely) unique rows?

2009-06-29 Thread Mark Na
Dear R-helpers,

I know how to use unique to select unique rows, e.g.

unique.rows <- unique(dataframe)

but I would like to select those rows that are unique on only TWO of my
dataframe's columns (so, two rows with the same value on these two columns
would not both be kept, even if they had different values in other columns).

For example, I have a dataframe with 10 columns, two of which are LATITUDE
and LONGITUDE. I wish to keep only one row per unique combination of these
two columns, so I've tried:

unique.latlong <- extracted[unique(paste(extracted$latitude, extracted$longitude)), ]

but this is returning a dataframe of missing values (NAs).

Could anyone point me in the right direction?

Thanks! Mark Na
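
A likely explanation for the NAs: indexing rows with character strings looks them up as row names, and the pasted latitude/longitude strings do not match any row name. A minimal sketch of one alternative, using duplicated() on just the two columns (the toy dataframe is made up for illustration):

```r
extracted <- data.frame(latitude  = c(50.1, 50.1, 51.2, 51.2, 52.3),
                        longitude = c(-105.0, -105.0, -106.5, -107.0, -108.0),
                        value     = 1:5)

# Keep the first row for each latitude/longitude combination
keep <- !duplicated(extracted[, c("latitude", "longitude")])
unique.latlong <- extracted[keep, ]
```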



[R] How to avoid ifelse statement converting factor to character

2009-06-24 Thread Mark Na
Hi R-helpers,

Please see the below R output.
The problem is that after running the ifelse statement, data$SOCIAL_STATUS
is converted from a factor to a character.

Is there some way I can avoid this conversion?

Thanks in advance, Mark Na


 str(data)
'data.frame': 2100 obs. of  11 variables:
 $ DATE           : Factor w/ 5 levels "4-Jun-09","7-May-09",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ POND_ID        : Factor w/ 113 levels "10","18","19",..: 8 8 8 8 8 8 8 8 8 8 ...
 $ STATUS         : num  1 1 1 1 1 1 1 1 1 1 ...
 $ SPECIES        : Factor w/ 25 levels "AGWT","AMCO",..: 10 10 7 7 3 5 5 5 5 2 ...
 $ SOCIAL_STATUS  : Factor w/ 8 levels "A","B","D","E",..: 4 1 4 1 4 4 4 4 1 6 ...
 $ COUNT_OF_GROUPS: num  1 1 1 1 1 3 3 3 1 2 ...
 $ MALE           : num  1 1 1 1 1 1 1 1 1 0 ...
 $ FEMALE         : num  1 0 1 0 1 1 1 1 0 0 ...
 $ NOSEX          : num  0 0 0 0 0 0 0 0 0 2 ...
 $ UPLAND         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ TAG            : num  0 0 0 0 0 0 0 0 0 0 ...

 data$SOCIAL_STATUS <- ifelse(data$SOCIAL_STATUS == "B" & data$MALE > 4, "C", data$SOCIAL_STATUS)

 str(data)
'data.frame': 2100 obs. of  11 variables:
 $ DATE           : Factor w/ 5 levels "4-Jun-09","7-May-09",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ POND_ID        : Factor w/ 113 levels "10","18","19",..: 8 8 8 8 8 8 8 8 8 8 ...
 $ STATUS         : num  1 1 1 1 1 1 1 1 1 1 ...
 $ SPECIES        : Factor w/ 25 levels "AGWT","AMCO",..: 10 10 7 7 3 5 5 5 5 2 ...
 $ SOCIAL_STATUS  : chr  "4" "1" "4" "1" ...
 $ COUNT_OF_GROUPS: num  1 1 1 1 1 3 3 3 1 2 ...
 $ MALE           : num  1 1 1 1 1 1 1 1 1 0 ...
 $ FEMALE         : num  1 0 1 0 1 1 1 1 0 0 ...
 $ NOSEX          : num  0 0 0 0 0 0 0 0 0 2 ...
 $ UPLAND         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ TAG            : num  0 0 0 0 0 0 0 0 0 0 ...
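
ifelse() does not carry factor attributes over to its result, which is why the output above shows the integer codes ("4", "1", ...) as character strings. One way around it, sketched here on made-up data, is to do the recode on a character copy and convert back to a factor afterwards:

```r
data <- data.frame(SOCIAL_STATUS = factor(c("A", "B", "B", "D")),
                   MALE = c(1, 5, 2, 6))

status <- as.character(data$SOCIAL_STATUS)    # work on the labels, not the codes
status[status == "B" & data$MALE > 4] <- "C"  # recode only the rows that qualify
data$SOCIAL_STATUS <- factor(status)          # back to a factor
```

Going through factor() again also adds "C" as a level, which a direct assignment into the original factor would not allow.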



[R] Apply as.factor (or as.numeric etc) to multiple columns

2009-06-23 Thread Mark Na
Hi R-helpers,

I have a dataframe with 60 columns and I would like to convert several
columns to factor, others to numeric, and yet others to dates. Rather
than having 60 lines like this:

data$Var1 <- as.factor(data$Var1)

I wonder if it's possible to write one line of code (per data type,
e.g. factor) that would apply a function (e.g., as.factor) to several
(non-contiguous) columns. So, I could then use 3 or 4 lines of code
(for 3 or 4 data types) instead of 60.

I have tried writing an apply function, but it failed.

Thanks for any help you might be able to provide.

Mark Na
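
One way to get this down to a line per data type is to assign the result of lapply() back into a subset of columns. The column names and the grouping vectors below are hypothetical; only the pattern matters.

```r
data <- data.frame(Var1 = c("a", "b", "a"),
                   Var2 = c("1.5", "2.5", "3.5"),
                   Var3 = c("x", "y", "y"),
                   Var4 = c("2009-06-01", "2009-06-02", "2009-06-03"),
                   stringsAsFactors = FALSE)

# Columns need not be contiguous; list them by name (or by position)
factor_cols  <- c("Var1", "Var3")
numeric_cols <- "Var2"
date_cols    <- "Var4"

data[factor_cols]  <- lapply(data[factor_cols],  as.factor)
data[numeric_cols] <- lapply(data[numeric_cols], as.numeric)
data[date_cols]    <- lapply(data[date_cols],    as.Date)
```

apply() tends to fail here because it first coerces the dataframe to a matrix, which can hold only one type.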



[R] Problem with ifelse statement

2009-06-23 Thread Mark Na
Hi R-helpers,

I am trying to use this ifelse statement to recode a variable:
 data$SOCIAL_STATUS <- ifelse(data$SOCIAL_STATUS == "B" & data$MALE > 4, "C", "B")

(i.e., if social status is B and there are more than 4 males, then
recode social status to C; otherwise, leave it B)

But, it's not working. See the below R output. Notice that there were
71 B observations before the re-code but 2098 B observations after
the re-code. The only thing my code should do is REDUCE the number of
B observations, not increase them.

Can anyone see what I'm doing wrong? Thanks!

Thanks, Mark Na


 str(data)
'data.frame':   2100 obs. of  13 variables:
 $ DATE           :Class 'Date'  num [1:2100] 14399 14399 14399 14399 14399 ...
 $ OBS            : Factor w/ 7 levels "AJG","LEB","MB",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ POND_ID        : Factor w/ 118 levels "1","10","100",..: 86 86 86 86 86 86 86 86 86 86 ...
 $ STATUS         : num  1 1 1 1 1 1 1 1 1 1 ...
 $ SPECIES        : Factor w/ 25 levels "AGWT","AMAV",..: 16 16 12 12 4 7 7 7 7 3 ...
 $ SOCIAL_STATUS  : Factor w/ 9 levels "","A","B","D",..: 5 2 5 2 5 5 5 5 2 8 ...
 $ COUNT_OF_GROUPS: num  1 1 1 1 1 3 3 3 1 2 ...
 $ MALE           : num  1 1 1 1 1 1 1 1 1 0 ...
 $ FEMALE         : num  1 0 1 0 1 1 1 1 0 0 ...
 $ NOSEX          : num  0 0 0 0 0 0 0 0 0 2 ...
 $ UPLAND         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ TAG            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ COMMENT        : chr  ...


 length(which(data$SOCIAL_STATUS == "B"))
[1] 71

 data$SOCIAL_STATUS <- ifelse(data$SOCIAL_STATUS == "B" & data$MALE > 4, "C", "B")

 length(which(data$SOCIAL_STATUS == "B"))
[1] 2098
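
The jump to 2098 is consistent with how ifelse() works: the third argument is the value given to every row where the test is FALSE, so all rows that are not "B with more than 4 males" become "B". A small made-up example contrasting that with passing the existing values through:

```r
status <- c("A", "B", "B", "D", "B")
male   <- c(1, 5, 2, 6, 7)

# As posted: everything that fails the test is overwritten with "B"
wrong <- ifelse(status == "B" & male > 4, "C", "B")

# Keeping the existing value for rows that fail the test
right <- ifelse(status == "B" & male > 4, "C", status)
```

With a factor column, the existing values would need as.character() first, otherwise the integer codes are returned.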



[R] SAS-like method of recoding variables?

2009-06-22 Thread Mark Na
Dear R-helpers,

I am helping a SAS user run some analyses in R that she cannot do in
SAS and she is complaining about R's peculiar (to her!) way of
recoding variables. In particular, she is wondering if there is an R
package that allows this kind of SAS recoding:

IF TYPE='TRUCK' and count=12 THEN VEHICLES=TRUCK+((CAR+BIKE)/2.2);

Thanks for any help or suggestions you might be able to provide!

Mark Na
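
No extra package should be needed for this kind of recode; base R can express it as an assignment restricted to the rows that meet the condition. The sketch below assumes the SAS condition is an equality test (count equal to 12) and uses a made-up dataframe called vehicles.

```r
vehicles <- data.frame(TYPE  = c("TRUCK", "TRUCK", "CAR"),
                       count = c(12, 5, 12),
                       TRUCK = c(10, 20, 30),
                       CAR   = c(2.2, 4.4, 6.6),
                       BIKE  = c(2.2, 0, 0),
                       VEHICLES = NA_real_)

# Rows where the SAS IF condition would be true
sel <- vehicles$TYPE == "TRUCK" & vehicles$count == 12

# The THEN clause, evaluated only on those rows
vehicles$VEHICLES[sel] <- with(vehicles[sel, ], TRUCK + (CAR + BIKE) / 2.2)
```

Rows that fail the condition keep whatever VEHICLES held before (NA here), which mirrors a SAS IF ... THEN with no ELSE.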



[R] Calculating row standard deviations

2009-06-22 Thread Mark Na
Hi R-helpers,

I have been struggling with calculating row and column statistics,
e.g. standard deviation.

I know that
 datac$Mean <- rowMeans(datac, na.rm = TRUE)
will give me row means.

I have tried to replicate those row means with the apply function:
 datac$Mean2 <- apply(datac, 2, mean)

so that I can replace the function argument with sd (instead of
mean) to get standard deviations.

But, I'm running into this error:

 dim(datac)
[1]  17 271
 datac$Mean2 <- apply(datac, 2, mean)
Error in dimnames(x) <- dn :
  length of 'dimnames' [2] not equal to array extent


Can anyone see what I'm doing wrong?

Thanks!

Mark Na
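
In apply(), a MARGIN of 1 means rows and 2 means columns, so apply(datac, 2, mean) returns one value per column, which cannot be stored as a new column of a 17-row dataframe. A minimal sketch with MARGIN = 1 on made-up data, computing the statistics from a copy of the original values so that the new Mean column is not fed into the sd:

```r
datac <- data.frame(a = c(1, 2, 3), b = c(2, 4, 6), c = c(3, 6, NA))

vals <- as.matrix(datac)                      # snapshot of the data columns only
datac$Mean <- rowMeans(vals, na.rm = TRUE)
datac$SD   <- apply(vals, 1, sd, na.rm = TRUE)  # 1 = apply over rows
```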



[R] How to change ONLY the first character of each variable name

2009-06-18 Thread Mark Na
Dear R-helpers,

I would like to adapt the following code

 names(data) <- sub("M", "MOLE", names(data))

which changes any occurrence of M (in my variable names) to MOLE

such that it ONLY operates on the first character of each variable
name, i.e. M will only be changed to MOLE if it's the first character
of a variable.

I would appreciate any help you might provide. Thanks!

Mark Na
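
Anchoring the pattern with ^ restricts the match to the start of each name. A short sketch on hypothetical column names:

```r
data <- data.frame(M1 = 1, AM2 = 2, MM3 = 3)

# "^M" matches an M only when it is the first character
names(data) <- sub("^M", "MOLE", names(data))
```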



[R] How to translate a dataframe into the R code that makes that dataframe?

2009-06-17 Thread Mark Na
Hi,

I am helping another R user (off list) and I would like to email her
an R script containing the data she needs and the code to solve her
problem. I have made a small dummy dataset, but instead of sending her
a CSV I would prefer to send the data embedded in the script, so there
would be a line in the script like:

my.df <- c( etc, etc, etc

I have made the dataframe (in a spreadsheet) and imported it into R
(using read.csv) and now I wonder if there is a function to produce
the code that makes that dataframe.

Thanks for any help you can provide.

Mark Na
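
dput() writes out an R expression that recreates an object, which seems to be what is wanted here. The sketch below captures that text and evaluates it again to show the round trip; the small dataframe is made up.

```r
my.df <- data.frame(id = 1:3, grp = c("a", "b", "a"), stringsAsFactors = FALSE)

# The text dput() prints; paste it into a script after "my.df <- "
code <- capture.output(dput(my.df))

# Evaluating that text rebuilds the dataframe
rebuilt <- eval(parse(text = code))
```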



Re: [R] how to read in only some columns of a data file

2009-06-17 Thread Mark Na
Use the colClasses argument in combination with "NULL" for the columns you
don't want to read. For example, this code reads the first three columns as
character data and does not import the remaining 10 columns:

yourdata <- read.csv("yourdata.csv", colClasses = c(rep("character", 3), rep("NULL", 10)))

HTH, Mark Na
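
A small self-contained version of the same idea, using a temporary file made up for illustration (four columns, of which only the first two are read):

```r
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:2, b = c("x", "y"), c = c(0.5, 1.5), d = 3:4),
          tmp, row.names = FALSE)

# "NULL" (as a quoted string) tells read.csv to skip that column
wanted <- read.csv(tmp, colClasses = c("integer", "character", "NULL", "NULL"))
unlink(tmp)
```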


On Wed, Jun 17, 2009 at 2:59 PM, liujb liujul...@yahoo.com wrote:


 Hello,

 I have a data file (.csv) that has a size of about 2.6 GB. I am not able to
 read in the whole data set because of the memory limit. I actually only
 need
 some columns (3 columns) of the data set, is there a way to read in
 specified columns?

 I am using windows.

 Thanks,
 Julia


[R] How to extract all rows that contain the value of X in any column?

2009-06-16 Thread Mark Na
Hi R-helpers,

I'm trying to use this code

 pvh_dnv <- pvh[sapply(pvh == "dnv"), ]

to make a new dataframe containing the rows from pvh that contain the
value of dnv in ANY column.

But, it's not working. I get this error

Error in match.fun(FUN) : element 1 is empty;
   the part of the args list of 'is.function' being evaluated was:
   (FUN)

which, to me, is cryptic.

I'd appreciate any help you might provide, thanks!

Mark Na
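
The error arises because sapply() is called without a function to apply. One alternative, sketched on made-up data and assuming "dnv" is a character value, is to compare the whole dataframe at once and count the matches per row:

```r
pvh <- data.frame(POND_ID = c(101, 102, 103),
                  d1 = c("0.15", "0", "1"),
                  d2 = c("dnv", "1", "0.5"),
                  stringsAsFactors = FALSE)

# pvh == "dnv" is a logical matrix; a row qualifies if it has any TRUE
has_dnv <- rowSums(pvh == "dnv", na.rm = TRUE) > 0
pvh_dnv <- pvh[has_dnv, ]
```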



[R] How to subset my dataframe? (a bit tricky)

2009-06-16 Thread Mark Na
Hi R-helpers,

I would like to subset my dataframe, keeping only those rows which
satisfy the following conditions:

1) the string dnv is found in at least one column;
2) the value in the column previous to the one dnv is found in is not 0

Here's what my data look like:

    POND_ID 2009-05-07 2009-05-15 2009-05-21 2009-05-28 2009-06-04

4       101       0.15          0        dnv        dnv        dnv
7       102          0        dnv        dnv        dnv        dnv
87      103       0.15        dnv          1          1          1
99      104        dnv       0.25          1          1       0.75

So, for above example, the new dataframe would not contain POND_ID 101
or 102 (because there is a 0 before the dnv) but it WOULD contain
POND_ID 103 (because there is a 0.15 before the dnv) and 104 (because
dnv occurs in the first column, so cannot be preceded by a 0).

One extra twist: I would like to retain rows in the new dataframe
which satisfy the above conditions even if they also have a 0 then
dnv sequence preceding or following the problem, e.g., the
following rows would be retained in the new dataframe

   POND_ID 2009-05-07 2009-05-15 2009-05-21 2009-05-28 2009-06-04

100     105       0.15        dnv          1          0        dnv
101     106       0           dnv          1          0.15     dnv

Thanks in advance for any help you might provide.

(I hope I've provided enough of an example; I could also provide a
.csv file if that would help.)

Mark Na
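
One reading of these conditions, sketched below, is: keep a row if at least one run of dnv values either starts in the first date column or is immediately preceded by a non-zero number. That reading is an assumption (it treats a dnv preceded by another dnv as part of the same run), and the column names d1 to d5 and the function name keep_row are hypothetical.

```r
pond <- data.frame(
  POND_ID = c(101, 102, 103, 104, 105, 106),
  d1 = c("0.15", "0",   "0.15", "dnv",  "0.15", "0"),
  d2 = c("0",    "dnv", "dnv",  "0.25", "dnv",  "dnv"),
  d3 = c("dnv",  "dnv", "1",    "1",    "1",    "1"),
  d4 = c("dnv",  "dnv", "1",    "1",    "0",    "0.15"),
  d5 = c("dnv",  "dnv", "1",    "0.75", "dnv",  "dnv"),
  stringsAsFactors = FALSE)

keep_row <- function(x) {
  is_dnv <- x == "dnv"
  prev <- c(NA, x[-length(x)])                  # value in the previous column
  # a dnv that begins a run: first column, or previous value is not dnv
  starts_run <- is_dnv & (is.na(prev) | prev != "dnv")
  # previous value is missing (first column) or a non-zero number
  prev_num <- suppressWarnings(as.numeric(prev))
  prev_ok <- is.na(prev) | prev_num != 0
  any(starts_run & prev_ok, na.rm = TRUE)
}

kept <- pond[apply(pond[, -1], 1, keep_row), ]
```

On the example rows this drops ponds 101 and 102 and keeps 103 to 106, matching the outcome described in the message.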



[R] How to fix my nested conditional IF ELSE code?

2009-06-14 Thread Mark Na
Hi,
I've been struggling most of the morning with an IF ELSE problem, and I
wonder if someone might be able to sort me out.

Here's what I need to do (dummy example, my data are more complicated):

If type = A or B or C
 and status = a then count = 1
 and status = b then count = 2
 and status = c then count = 3

Else if type = D or E or F
 and status = a then count = 9
 and status = b then count = 8
 and status = c then count = 7

End

Seems simple when I write it like that, but the R code is escaping me.

Thanks!

Mark Na
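
One way to express this without nested ifelse calls is a pair of lookup vectors indexed by status, applied separately to each group of types. The dataframe DF and the vector names are made up for illustration.

```r
DF <- data.frame(type   = c("A", "E", "C", "F", "Z"),
                 status = c("a", "b", "c", "a", "a"),
                 stringsAsFactors = FALSE)

# Lookup tables: status -> count, one per group of types
abc <- c(a = 1, b = 2, c = 3)
def <- c(a = 9, b = 8, c = 7)

in_abc <- DF$type %in% c("A", "B", "C")
in_def <- DF$type %in% c("D", "E", "F")

DF$count <- NA_real_                          # rows matching neither group stay NA
DF$count[in_abc] <- abc[DF$status[in_abc]]
DF$count[in_def] <- def[DF$status[in_def]]
```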



Re: [R] How to fix my nested conditional IF ELSE code?

2009-06-14 Thread Mark Na
Thanks Gabor, this is quite clever and it's nice to see another way of doing
it (without ifelse).
Mark


On Sun, Jun 14, 2009 at 6:51 PM, Gabor Grothendieck ggrothendi...@gmail.com
 wrote:

 Note that TRUE and FALSE become 1 and 0 when used in arithmetic
 formulae so:

 result <- with(DF,
 (type %in% c("A", "B", "C")) *
 (1 * (status == "a") + 2 * (status == "b") + 3 * (status == "c")) +
 (type %in% c("D", "E", "F")) *
 (9 * (status == "a") + 8 * (status == "b") + 7 * (status == "c")))

 If none of the conditions hold for row i then result[i] will be 0.


 On Sun, Jun 14, 2009 at 6:18 PM, Mark Namtb...@gmail.com wrote:
  Hi,
  I've been struggling most of the morning with an IF ELSE problem, and I
  wonder if someone might be able to sort me out.
 
  Here's what I need to do (dummy example, my data are more complicated):
 
  If type = A or B or C
  and status = a then count = 1
  and status = b then count = 2
  and status = c then count = 3
 
  Else if type = D or E or F
  and status = a then count = 9
  and status = b then count = 8
  and status = c then count = 7
 
  End
 
  Seems simple when I write it like that, but the R code is escaping me.
 
  Thanks!
 
  Mark Na
 


[R] Expand a contingency table based on the value in one column

2009-06-11 Thread Mark Na
Hi R-helpers,

I have the following (dummy) dataframe:

 test
  DATE LOCATION  KIND CLASS COUNT
1    1        1   CAR     A     2
2    1        1 TRUCK     D     3
3    1        1   BUS     E     4
4    1        2   CAR     E     2
5    1        2 TRUCK     A     7
6    1        2   BUS     F     1

That I would like to turn into this:

 test2
   DATE LOCATION  KIND CLASS
1     1        1   CAR     A
2     1        1   CAR     A
3     1        1 TRUCK     D
4     1        1 TRUCK     D
5     1        1 TRUCK     D
6     1        1   BUS     E
7     1        1   BUS     E
8     1        1   BUS     E
9     1        1   BUS     E
10    1        2   CAR     E
11    1        2   CAR     E
12    1        2 TRUCK     A
13    1        2 TRUCK     A
14    1        2 TRUCK     A
15    1        2 TRUCK     A
16    1        2 TRUCK     A
17    1        2 TRUCK     A
18    1        2 TRUCK     A
19    1        2   BUS     F

So, basically it's a case of expanding (adding rows to) the first dataframe
by the value in the COUNT column.

I have solved this problem with the following code:

test2 <- with(test, data.frame(DATE = rep(DATE, COUNT),
LOCATION = rep(LOCATION, COUNT), KIND = rep(KIND, COUNT), CLASS = rep(CLASS, COUNT)))

but I'm unsatisfied with that solution because it's verbose and I think
there must be a more elegant way. If I had more variables than 4 (which I do in
my real data) it would be a nuisance to repeat each column within the rep
function.

I would prefer to do this with Base R or package(reshape) than relying on
another package.

Any ideas? Thanks!

Mark Na
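
A base R sketch of one shorter route: repeat the row indices rather than each column, so the number of columns no longer matters. The dataframe below rebuilds the dummy example from the message.

```r
test <- data.frame(DATE = 1,
                   LOCATION = c(1, 1, 1, 2, 2, 2),
                   KIND  = c("CAR", "TRUCK", "BUS", "CAR", "TRUCK", "BUS"),
                   CLASS = c("A", "D", "E", "E", "A", "F"),
                   COUNT = c(2, 3, 4, 2, 7, 1))

# Repeat row i COUNT[i] times, and drop the COUNT column
test2 <- test[rep(seq_len(nrow(test)), test$COUNT), names(test) != "COUNT"]
rownames(test2) <- NULL   # renumber rows 1..n
```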



[R] Specifying data type when creating a dataframe using RODBC

2009-05-12 Thread Mark Na
Hi R-helpers,

I am using the following code to make a dataframe from an Excel spreadsheet:

library(RODBC)
channel <- odbcConnectExcel("Spreadsheet.xls")
Data <- sqlFetch(channel, "Tab1")
odbcClose(channel)

One column (several, actually) in the spreadsheet contains integers in
its first few rows but later values in these columns contain a mixture
of numbers, letters and symbols (it's an ID variable, containing e.g.,
12, 14, 19, 19B, 19C, 19/20)

R creates this column as a numeric variable (I think because its first
few values are numbers) but as soon as R gets to the non-numeric
values (e.g., 19/20) it replaces them with NA.

So, my question is: how can I specify that certain columns are to be
read as character variables BEFORE the dataframe is created?

I have tried using as.character() in the third line (above) but it
creates a very long first column containing all of my data...

Thanks for any help you might provide,

Mark Na



[R] Read many .csv files into an R session automatically, without specifying filenames

2009-05-11 Thread Mark Na
Hi R-helpers,

I would like to read into R all the .csv files that are in my working
directory, without having to use a read.csv statement for each file.

Each .csv would be read into a separate dataframe which would acquire
the filename of the .csv.

As an example:

Mark <- read.csv("Mark.csv")

...but the code (or command) would do this automatically for every
.csv in the working directory without my specifying each file.

I'd appreciate any help or ideas you might have.

Thanks!

Mark Na
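
A sketch of one common pattern: list the .csv files, read them all with lapply(), and name each dataframe after its file. The temporary directory and file contents below are made up so the example can run anywhere; in practice the directory would be the working directory.

```r
# Set up a throwaway directory with two small .csv files
csv_dir <- tempfile("csvdemo")
dir.create(csv_dir)
write.csv(data.frame(x = 1:2), file.path(csv_dir, "Mark.csv"),  row.names = FALSE)
write.csv(data.frame(y = 3:5), file.path(csv_dir, "Other.csv"), row.names = FALSE)

files  <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
frames <- lapply(files, read.csv)
names(frames) <- tools::file_path_sans_ext(basename(files))  # "Mark", "Other"

unlink(csv_dir, recursive = TRUE)
```

Keeping the dataframes together in a named list (frames$Mark) is usually easier to work with; list2env(frames, envir = globalenv()) would turn them into separate objects if that is preferred.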



[R] Set working directory by dragging text file onto R shortcut? [WinXP, unfortunately]

2009-05-11 Thread Mark Na
Hi R-helpers,

I must use WinXP at work, and I'm missing a particular feature that's
available in R on my Mac at home...

In WinXP I would like to launch R by dragging and releasing a text
file on top of the R shortcut on my desktop. Then, I would like the
working directory to be automatically set to the location of that text
file.

That's what works in MacOS, but I can't figure out if it's possible in WinXP.

Any ideas? I'd appreciate it...

Thanks, Mark Na

PS What I do now is make a new R shortcut for each project, store that
shortcut in the directory containing that project's datafiles and R
programs (which are in text files), and set the Start in value to
that directory...but this has to be set up uniquely for each
project...



[R] Relative subscripts?

2009-05-05 Thread Mark Na
Dear R-helpers,

I have a dataframe with several columns, one of which is called LENGTH.

I would like to make a new column called DIFF containing the value of
LENGTH minus LENGTH in the previous row, like this:

  ID LENGTH DIFF
1  1     10   NA
2  2     15    5
3  3     20    5
4  4     12   -8
5  5     18    6

I'd like to think there are relative subscripts in R but I can't
find any reference to such a thing.

Any help solving this problem would be much appreciated, thanks!

Mark Na
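
diff() computes each value minus the previous one, so padding its result with a leading NA gives the DIFF column described above. A minimal sketch using the example values (the dataframe name df is made up):

```r
df <- data.frame(ID = 1:5, LENGTH = c(10, 15, 20, 12, 18))

# diff() returns n - 1 differences; the first row has no previous row
df$DIFF <- c(NA, diff(df$LENGTH))
```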



[R] Please help me subset this dataframe, thanks...

2009-05-03 Thread Mark Na
Dear R-helpers,

I have a dataframe called trackpoints with several columns including
a column called time, eg:

 trackpoints
        time
1   12:00:00
2   12:00:01
3   12:00:02
.
.
.
298 12:04:57
299 12:04:58
300 12:04:59

I also have a dataframe called data that contains columns called
ID, start and stop, eg:

 data
  ID    start     stop
1  1 12:00:00 12:01:30
2  2 12:02:16 12:03:01
3  3 12:03:58 12:04:31

I wish to make a dataframe called extracted containing only the rows
in trackpoints with a value of time bounded by the times in
data$start and data$stop and a column called ID containing the value
from data$ID, eg:

 extracted
        Time ID
1   12:00:00  1
2   12:00:01  1
3   12:00:02  1
.
.
.
89  12:01:28  1
90  12:01:29  1
91  12:01:30  1

I have the vague notion that I might have to loop this, but I think it
would be cleaner to use logical subscripts, if possible.
I'd appreciate any help you might be able to provide.

Thanks! Mark Na
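
A sketch of one approach: convert the times to seconds, then for each trackpoint find the interval (if any) that contains it. The helper to_sec and the simulated trackpoints are assumptions; the start and stop times are taken from the example above.

```r
# Hypothetical helper: "HH:MM:SS" strings to seconds since midnight
to_sec <- function(x) {
  parts <- do.call(rbind, strsplit(x, ":", fixed = TRUE))
  as.numeric(parts[, 1]) * 3600 + as.numeric(parts[, 2]) * 60 + as.numeric(parts[, 3])
}

# One trackpoint per second from 12:00:00 to 12:04:59
trackpoints <- data.frame(
  time = sprintf("12:%02d:%02d", rep(0:4, each = 60), rep(0:59, 5)),
  stringsAsFactors = FALSE)
data <- data.frame(ID = 1:3,
                   start = c("12:00:00", "12:02:16", "12:03:58"),
                   stop  = c("12:01:30", "12:03:01", "12:04:31"),
                   stringsAsFactors = FALSE)

tp <- to_sec(trackpoints$time)
s0 <- to_sec(data$start)
s1 <- to_sec(data$stop)

# Index of the interval containing each trackpoint, or NA if none does
hit <- sapply(tp, function(t) {
  i <- which(t >= s0 & t <= s1)
  if (length(i)) i[1] else NA_integer_
})

extracted <- data.frame(time = trackpoints$time[!is.na(hit)],
                        ID   = data$ID[hit[!is.na(hit)]])
```

This still loops internally (via sapply), but the selection itself is done with logical subscripts as hoped.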



[R] How to summarise several models in a single table

2009-03-14 Thread Mark Na
Dear R-helpers,

I have produced several models, named model1, model2, model3, etc...

I would like to extract several elements from each model's object, e.g. at
minimum the estimates, SEs, and P values of each model's intercept and
slopes, model R-squared, and AIC...

...and then produce a new object (a table) that summarises all of my models,
with models in rows and extracted model elements in columns.

Before reinventing the wheel, I wonder if there is a package or function
that does what I need?

Thank you!

Mark Na
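
Without relying on any particular package, a table like this can be assembled by writing a small function that pulls the wanted pieces from one model and applying it to a list of models. The sketch uses lm fits on the built-in mtcars data as stand-ins, and the function name summarise_model is hypothetical; only the intercept is extracted because the number of slopes differs between models.

```r
models <- list(model1 = lm(mpg ~ wt, data = mtcars),
               model2 = lm(mpg ~ wt + hp, data = mtcars))

# Pull a fixed set of elements out of one fitted model
summarise_model <- function(m) {
  s  <- summary(m)
  cf <- coef(s)                       # matrix of estimates, SEs, t and p values
  data.frame(intercept    = cf[1, "Estimate"],
             intercept_se = cf[1, "Std. Error"],
             intercept_p  = cf[1, "Pr(>|t|)"],
             r_squared    = s$r.squared,
             AIC          = AIC(m))
}

# One row per model, one column per extracted element
model_table <- do.call(rbind, lapply(models, summarise_model))
```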



[R] Logical subset of the columns in a dataframe

2009-01-28 Thread Mark Na
Hi R-helpers,

I've been struggling with a problem for most of the day (!) so am finally
resorting to R-help.

I would like to subset the columns of my dataframe based on the frequency
with which the columns contain non-zero values. For example, let's say that
I want to retain only those columns which contain non-zero values in at
least 1% of their rows.

In Excel I would calculate a row at the bottom of my data sheet and use the
following function

=countif(range,"<>0")

to identify the number of non-zero cells in each column. Then, I would
divide that by the number of rows to obtain the frequency of non-zero values
in each column. Then, I would delete those columns with frequencies < 0.01.

But, I'd like to do this in R. I think the missing link is an analog to
Excel's countif function. Any ideas?

Thanks! Mark
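
In R the countif step can be replaced by a comparison followed by colMeans(), since the mean of a logical column is the proportion of TRUE values. A sketch on made-up data, with a threshold of 0.5 instead of 0.01 so that the tiny example shows an effect:

```r
df <- data.frame(a = c(0, 0, 0, 0),
                 b = c(0, 1, 0, 0),
                 c = c(2, 3, 0, 1))

threshold <- 0.5                       # would be 0.01 for the 1% rule
nonzero_freq <- colMeans(df != 0)      # proportion of non-zero rows per column
kept <- df[, nonzero_freq >= threshold, drop = FALSE]
```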



[R] How to add a data line (series) to a plot using add=TRUE

2008-10-28 Thread Mark Na

Hello,

I'd like to use the add=TRUE parameter to add a second data line 
(series) to an existing plot, but R is giving me an error (see below).


This code:
 rap <- plot(aspen_sort, ylim = c(1,1), log = "y")

...produces the plot to which I'd like to add the second line. But this 
code:

 rap <- plot(pine_sort, add = TRUE)

...produces this error:

Warning messages:
1: In plot.window(...) : add is not a graphical parameter
2: In plot.xy(xy, type, ...) : add is not a graphical parameter
3: In axis(side = side, at = at, labels = labels, ...) :
 add is not a graphical parameter
4: In axis(side = side, at = at, labels = labels, ...) :
 add is not a graphical parameter
5: In box(...) : add is not a graphical parameter
6: In title(...) : add is not a graphical parameter

I have successfully used add=TRUE (in the same program!) with no errors, 
so I reckon the problem must be related to my data structures, but I 
can't see how.


Any ideas? Thanks!

Mark
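
The default plot() method for a plain vector has no add argument (only some plot methods accept it), which would explain the warnings. The usual way to add a second series to an existing plot is lines() or points(). A sketch on made-up values, drawn to a null PDF device so it runs without opening a window:

```r
aspen_sort <- sort(c(120, 45, 30, 12, 5, 2), decreasing = TRUE)
pine_sort  <- sort(c(80, 60, 20, 9, 3, 1),  decreasing = TRUE)

pdf(NULL)  # null device; omit this line to draw on screen
plot(aspen_sort, type = "l", log = "y",
     ylim = range(aspen_sort, pine_sort))  # leave room for both series
lines(pine_sort, lty = 2)                  # second series on the same axes
invisible(dev.off())
```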



[R] How to update a column in a dataframe, more simply...

2008-09-26 Thread Mark Na

Hello,

I would like to be able to update an existing column in a dataframe, 
like this...


data$score[data$type==1 & data$year==2001] <- data$score * 0.111
data$score[data$type==1 & data$year==2002] <- data$score * 0.222
data$score[data$type==1 & data$year==2003] <- data$score * 0.333

...but, if possible, using simpler code. I've got several dozen lines of 
code like this (type 2, type3, etc. for the same years) so it would be 
great if I could reduce each set of three lines of code to one line


Any help much appreciated, thanks!

Mark
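
One way to collapse each set of three lines is a named vector of multipliers looked up by year. Note also that the right-hand side should be subset with the same condition as the left; otherwise the replacement values are taken from the wrong rows. The data below are made up.

```r
data <- data.frame(type  = c(1, 1, 1, 2),
                   year  = c(2001, 2002, 2003, 2001),
                   score = c(10, 10, 10, 10))

# Multiplier for each year, looked up by name
mult <- c("2001" = 0.111, "2002" = 0.222, "2003" = 0.333)

sel <- data$type == 1
data$score[sel] <- data$score[sel] * mult[as.character(data$year[sel])]
```

A year that is missing from the lookup vector would produce NA, so the vector needs an entry for every year present in the selected rows.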



[R] How to order some of my columns (not rows) alphabetically

2008-09-25 Thread Mark Na

Hello,

I have a dataframe with 9 columns, and I would like to sort (order) the 
right-most seven of them alphabetically, i.e.:


ID1 ID2 F G A B C E D

would become

ID1 ID2 A B C D E F G

Right now, I'm using this code:

attach(data)
data <- data.frame(ID1, ID2, data[, sort(colnames(data)[3:9])])
detach(data)

but that's not very elegant. Ideally I could specify which columns to 
sort and which to leave as is (but my attempts to do so have failed).


Thank you,

Mark
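
A sketch that avoids attach(): build the new column order as a character vector, leaving the first two names alone and sorting the rest, then index the dataframe with it. The values are made up.

```r
data <- data.frame(ID1 = 1, ID2 = 2, F = 3, G = 4, A = 5, B = 6, C = 7, E = 8, D = 9)

fixed  <- names(data)[1:2]           # columns to leave where they are
sorted <- sort(names(data)[3:9])     # columns to put in alphabetical order
data   <- data[, c(fixed, sorted)]
```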



[R] How can I comment out whole chunks of code?

2008-09-24 Thread Mark Na

Hello,

I know this has been discussed, but I haven't found an answer in the 
archives. Basically, I'd like to be able to comment out chunks of code 
(which may or may not be syntactically correct) without having to put 
the # symbol in front of each line (and, if possible, without having to 
adopt a new text editor).


My current R setup (XP) is very simple. I always have three windows 
open: the R console, my working directory, and a Notepad window 
containing my program. Adopting Tinn-R would probably solve this 
problem, but for simplicity I'd rather not move beyond Notepad (if 
possible).


Thanks for any help you can provide,

Mark
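
A partial workaround that needs no special editor is to wrap the chunk in if (FALSE) { ... }. The caveat is that the wrapped code must still parse, so this does not cover the syntactically incorrect case mentioned above.

```r
x <- 1

if (FALSE) {
  # Everything in this block is parsed but never run
  x <- x + 100
  stop("this line is never reached")
}
```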



[R] Multiple logical operations in a subscript

2008-09-18 Thread Mark Na

Hello,

I would like to select cases using multiple logical operations (e.g. X 
or Y or Z) without having to repeat the dataframe$variable within the 
subscript. My working code (with a single logical operator) currently 
looks like this:


dataframe$newvariable[data$oldvariable == "X"] <- "group1"

I thought this next line of code might do what I wanted, but it doesn't:

dataframe$newvariable[data$oldvariable == "X" | "Y" | "Z"] <- "group1"

I'd appreciate any suggestions. I've tried playing around with grep, but 
can't make it work.


Thanks! Mark
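
The %in% operator tests membership in a set of values, which covers "X or Y or Z" without repeating the variable. A sketch on made-up data:

```r
dataframe <- data.frame(oldvariable = c("X", "Y", "Q", "Z"),
                        newvariable = NA_character_,
                        stringsAsFactors = FALSE)

# TRUE wherever oldvariable is any one of the listed values
sel <- dataframe$oldvariable %in% c("X", "Y", "Z")
dataframe$newvariable[sel] <- "group1"
```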



[R] Turn off save workspace prompt (WinXP, R 2.7.2)

2008-09-16 Thread Mark Na

Hi,

I'd like R to no longer prompt me to save my workspace every time I 
quit. I seem to recall seeing this option (maybe in the OS X console?) 
but I can't seem to find it in WinXP. Can anyone help?


Thanks! Mark



[R] k-sample Kolmogorov-Smirnov test?

2008-09-14 Thread Mark Na
Hello, I would like to conduct a k-sample K-S test, but cannot find
reference to its implementation in R. Does anyone have experience with this?
Thanks, Mark



[R] xls to csv conversion via WinXP's context menu?

2008-09-05 Thread Mark Na
Frequently I need to convert a .xls to a .csv (for import into R) and I 
do this by opening the file in excel and saving it as a csv. I would 
rather do this using WinXP's context menu (right click the xls, choose 
convert to csv) but I don't know of a utility that does this. Any 
ideas? Thanks, Mark




[R] How many parameters does my model (gls) have?

2008-08-26 Thread Mark Na

Hello,

Is there a way to output the number of parameters in my model (gls)? I 
can count the number of estimates, but I'd like to use the number of 
parameters as an R object.


Thanks, Mark
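
One place this number is stored is the "df" attribute of the log-likelihood object, which counts the estimated parameters (coefficients plus variance and correlation parameters). A sketch using a gls fit on the built-in mtcars data as a stand-in for the real model:

```r
fit <- nlme::gls(mpg ~ wt, data = mtcars)

# Number of estimated parameters: 2 coefficients + 1 residual variance here
n_par <- attr(logLik(fit), "df")
```

If only the fixed-effect coefficients are wanted, length(coef(fit)) gives that count instead.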



[R] How to run a model 1000 times, while saving coefficients each time?

2008-08-25 Thread Mark Na

Hello,

We have written a program (below) to model the effect of a covariate on 
observed values of a response variable (using only 80% of the rows in 
our dataframe) and then use that model to calculate predicted values for 
the remaining 20% of the rows. Then, we compare the observed vs. 
predicted values using a linear model and inspect that model's 
coefficients and its R2 value.


We wish to run this program 1000 times, and to save the coefficients and 
R2 values into a separate dataframe called results.


We have a looping structure (also below), but we do not know how to save
the coefficients and R2 values. We are missing some code (indicated).


Any assistance would be greatly appreciated.

Thanks,


library(sampling)

mall <- read.csv("mall.csv")

for (j in 1:1000) {

  # simple random sample without replacement: 2840 of 3550 rows (80%)
  s <- srswor(2840, 3550)
  mall80 <- mall[s == 1, ]
  mall20 <- mall[s == 0, ]
  model1 <- lm(count ~ habitat, data = mall80)
  summary(model1)
  mall20$predicted <- predict(model1, newdata = mall20)
  model2 <- lm(count ~ predicted, data = mall20)

  # MISSING CODE: SAVE MODEL COEFFICIENTS AND R2 VALUE TO A DATAFRAME CALLED
  # RESULTS

}
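
One way the missing step might be filled in is to preallocate a results
object before the loop and write one row per iteration. The sketch below is
only an illustration under stated assumptions: mall.csv is not available
here, so a simulated data frame with made-up habitat levels stands in for it,
and base R's sample() replaces srswor() so that no extra package is needed.

```r
set.seed(1)

# Hypothetical stand-in for mall.csv: 3550 rows with a count response
# and a three-level habitat factor (levels and means are made up).
n_rows <- 3550
mean_count <- c(field = 1, forest = 2, marsh = 4)
mall <- data.frame(
  habitat = factor(sample(names(mean_count), n_rows, replace = TRUE))
)
mall$count <- rpois(n_rows, lambda = mean_count[as.character(mall$habitat)])

n_iter <- 1000

# Preallocate one row per iteration
results <- matrix(NA_real_, nrow = n_iter, ncol = 3,
                  dimnames = list(NULL, c("intercept", "slope", "r2")))

for (j in seq_len(n_iter)) {
  # base-R substitute for srswor(2840, 3550): row indices of the 80% sample
  train <- sample(n_rows, 2840)
  mall80 <- mall[train, ]
  mall20 <- mall[-train, ]

  model1 <- lm(count ~ habitat, data = mall80)
  mall20$predicted <- predict(model1, newdata = mall20)
  model2 <- lm(count ~ predicted, data = mall20)

  # Store intercept, slope and R-squared of the observed-vs-predicted model
  results[j, ] <- c(coef(model2), summary(model2)$r.squared)
}

results <- as.data.frame(results)
summary(results)
```

Filling a preallocated matrix and converting it to a data frame at the end
avoids growing a data frame with rbind() inside the loop, which gets slow as
the number of iterations increases.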



[R] A repeated measures, linear mixed model (lme) WITHOUT random effects...

2008-08-25 Thread Mark Na

Hello,

I am trying to fit a repeated measures linear mixed model (using lme) 
but I don't want to include any random effects. I'm having trouble (even 
after consulting Pinheiro & Bates 2000) figuring out how to specify the 
repeated measure without including it in the specification of a random 
effect.


My data consist of repeated counts in plots that I wish to model as 
a function of habitat. This attempt:


model <- lme(count ~ habitat - 1, data = dataframe, method = "ML")

doesn't consider the repeated nature of my counts (i.e., that there are
multiple rows in my dataframe for each plot). I know how to
include "plot" as a random effect, but I don't wish to do that, and I
can't see how to include it without doing so.


I'll appreciate any help you can provide.

Thanks, Mark
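
A possibility, offered as a sketch rather than a definitive answer: gls()
from nlme fits the same fixed-effects formula with no random effects and can
account for the repeated measures through a within-plot correlation
structure. The simulated data frame, its column names and the choice of
compound symmetry (corCompSymm) are assumptions made for illustration only.

```r
library(nlme)
set.seed(42)

# Hypothetical data: 30 plots, 4 repeated counts per plot, 3 habitats
n_plot <- 30
n_rep <- 4
d <- data.frame(
  plot = factor(rep(seq_len(n_plot), each = n_rep)),
  habitat = factor(rep(rep(c("A", "B", "C"), length.out = n_plot),
                       each = n_rep))
)

# A shared plot-level shift makes counts from the same plot correlated
plot_shift <- rnorm(n_plot)[as.integer(d$plot)]
d$count <- 5 + as.integer(d$habitat) + plot_shift + rnorm(nrow(d))

# No random effects; rows from the same plot share a common correlation
fit <- gls(count ~ habitat - 1, data = d,
           correlation = corCompSymm(form = ~ 1 | plot),
           method = "ML")
summary(fit)
```

Other corStruct classes (for example corAR1) can be swapped in through the
same correlation argument if the counts have a natural time order.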



[R] lme questions re: repeated measures covariance structure

2008-08-22 Thread Mark Na
Hello,

We are attempting to use nlme to fit a linear mixed model to explain bird
abundance as a function of habitat:


lme(abundance ~ habitat - 1, data = data, method = "ML", random = ~ 1 | sampleunit)

The data consist of repeated counts of birds in sample units across multiple
years, and we have two questions:

1) Is it necessary (and, if so, how) to specify the repeated measure
(years)? As written, the above code does not.

2) How can we specify a Toeplitz heterogeneous covariance structure for this
model? We have searched the help file for lme, and the R-help archives, but
cannot find any pertinent information. If that's not possible, can we adapt
an existing covariance structure, and if so how?

Thanks, Mark
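
As far as I know, nlme has no ready-made heterogeneous Toeplitz class. A
rough approximation, sketched below under several assumptions, combines a
within-unit correlation structure (here corAR1) with a separate residual
variance per year (varIdent). The simulated data, the column names yr and
year, and the choice of corAR1 are all illustrative and not taken from the
original analysis.

```r
library(nlme)
set.seed(7)

# Hypothetical data: 80 sample units, each counted in 4 years
n_unit <- 80
n_year <- 4
d <- expand.grid(yr = seq_len(n_year), sampleunit = seq_len(n_unit))
d$sampleunit <- factor(d$sampleunit)
d$year <- factor(d$yr)
d$habitat <- factor(c("A", "B")[1 + (as.integer(d$sampleunit) %% 2)])

# Unit-level random shift, AR(1) errors within unit, year-specific spread
unit_effect <- rnorm(n_unit, sd = 2)[as.integer(d$sampleunit)]
ar_err <- unlist(lapply(seq_len(n_unit), function(i) {
  as.numeric(arima.sim(list(ar = 0.4), n = n_year))
}))
year_sd <- c(1, 1.5, 2, 2.5)[d$yr]
d$abundance <- 10 + 2 * (d$habitat == "B") + unit_effect + year_sd * ar_err

fit <- lme(abundance ~ habitat - 1, data = d,
           random = ~ 1 | sampleunit,
           correlation = corAR1(form = ~ yr | sampleunit),  # repeated measure
           weights = varIdent(form = ~ 1 | year),           # variance per year
           method = "ML")
summary(fit)
```

Here the repeated measure enters through the form argument of the
correlation structure (year order within sample unit), which also bears on
question 1.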



Re: [R] lme questions re: repeated measures covariance structure

2008-08-22 Thread Mark Na
Great, thanks for your reply!

I've just tracked down a copy of Pinheiro & Bates (2000) so I'll look at
that, too.

Thanks again, Mark


On Fri, Aug 22, 2008 at 9:56 AM, Christoph Scherber 
[EMAIL PROTECTED] wrote:

 Dear Mark,

 I would include the repeated measure as the smallest stratum in the random
 effects specification:

 random=~1|sampleunit/year

 Setting up user-defined variance structures should be possible using for
 example:

 weights=varPower(form=~habitat)

 or also try out the available corStruct() classes (found in Pinheiro and
 Bates 2000)

 HTH
 Christoph




 Mark Na schrieb:

 Hello,

 We are attempting to use nlme to fit a linear mixed model to explain bird
 abundance as a function of habitat:


 lme(abundance ~ habitat - 1, data = data, method = "ML", random = ~ 1 | sampleunit)

 The data consist of repeated counts of birds in sample units across multiple
 years, and we have two questions:

 1) Is it necessary (and, if so, how) to specify the repeated measure
 (years)? As written, the above code does not.

 2) How can we specify a Toeplitz heterogeneous covariance structure for this
 model? We have searched the help file for lme, and the R-help archives, but
 cannot find any pertinent information. If that's not possible, can we adapt
 an existing covariance structure, and if so how?

 Thanks, Mark



 --
 Dr. rer.nat. Christoph Scherber
 University of Goettingen
 DNPW, Agroecology
 Waldweg 26
 D-37073 Goettingen
 Germany

 phone +49 (0)551 39 8807
 fax   +49 (0)551 39 8806

 Homepage http://www.gwdg.de/~cscherb1

