[R] creating bins for a plot

2006-10-18 Thread Jeffrey Stratford
Hi.  I'm trying to plot the ratio of used versus unused bird houses
(coded 1 or 0) versus a continuous environmental gradient (proportion of
 urban cover [purban2]) that I would like to convert into bins (0 -
0.25, 0.26 - 0.5, 0.51 - 0.75, 0.76 - 1.0) and I'm not having much luck
figuring this out.  I ran a logistic regression and purban2 ends up
driving the probability of a box being occupied so it would be nice to
show this relationship.  I'm also plotting the fitted values vs. purban2
but that's done.  Any suggestions would be appreciated.

Many thanks,

Jeff


Data sample:

box use purbank purban2
1   1   0.003813435 0.02684564
2   1   0.04429451  0.1610738
3   1   0.04458785  0.06040268
4   1   0.06072162  0.2080537
5   0   0.6080962   0.6979866
6   1   0.6060428   0.6107383
7   1   0.3807568   0.4362416
8   0   0.3649164   0.3154362
9   0   0.3505427   0.2483221
10  0   0.3476093   0.1409396
11  0   0.3719566   0.3020134
12  1   0.09238011  0.1342282
13  0   0.08616111  0.1073826
14  0   0.07388724  0.04026845
15  1   0.07046477  0.03355705
.
.
.



Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] nontabular logistic regression

2006-10-13 Thread Jeffrey Stratford
Hi.  I'm attempting to fit a logistic/binomial model so I can determine
the influence of landscape on the probability that a box gets used by a
bird.  I've looked at a few sources (MASS text, Dalgaard, Fox and
google) and the examples are almost always based on tabular predictor
variables.  My data, however are not.  I'm not sure if that is the
source of the problems or not because the one example that includes a
continuous predictor looks to be coded exactly the same way.  Looking at
the output, I get estimates for each case when I should get a single
estimate for purbank.  Any suggestions?

Many thanks,

Jeff


THE DATA: (200 boxes total, used [0 if unoccupied, 1 occupied], the rest
are landscape variables).  

box use purbank purban2 purban1 pgrassk pgrass2 pgrass1 grassdist   
grasspatchk
1   1   0.003813435 0.02684564  0.06896552  0.3282487   
0.6845638   0.7586207   0   3.73
2   1   0.04429451  0.1610738   0.1724138   0.1534174   
0.3825503   0.6551724   0   1.023261
3   1   0.04458785  0.06040268  0   0.1628043   
0.5570470.7586207   0   0.9605769
4   1   0.06072162  0.2080537   0.06896552  0.01936052  
0   0   323.10990.2284615
5   0   0.6080962   0.6979866   0.6896552   0.03168084  
0.1275168   0.2413793   30  0.2627027
6   1   0.6060428   0.6107383   0.3448276   0.04077442  
0.2885906   0.4482759   30  0.2978571
7   1   0.3807568   0.4362416   0.6896552   0.06864183  
0.03355705  0   94.868330.468
8   0   0.3649164   0.3154362   0.4137931   0.06277501  
0.1275168   0   120 0.4585714

THE CODE:

box.use- read.csv(c:\\eabl\\2004\\use_logistic2.csv, header=TRUE)
attach(box.use)
box.use - na.omit(box.use)
use - factor(use, levels=0:1)
levels(use) - c(unused, used)
glm1 - glm(use ~ purbank, binomial)

THE OUTPUT:

Coefficients:
 Estimate Std. Error   z value Pr(|z|)
(Intercept)-4.544e-16  1.414e+00 -3.21e-161.000
purbank02.157e+01  2.923e+04 0.0010.999
purbank0.001173365  2.157e+01  2.067e+04 0.0010.999
purbank0.001466706  2.157e+01  2.923e+04 0.0010.999
purbank0.001760047  6.429e-16  2.000e+00  3.21e-161.000
purbank0.002346729  2.157e+01  2.923e+04 0.0010.999
purbank0.003813435  2.157e+01  2.923e+04 0.0010.999
purbank0.004106776  2.157e+01  2.067e+04 0.0010.999
purbank0.004693458  2.157e+01  2.067e+04 0.0010.999



Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] nontabular logistic regression

2006-10-13 Thread Jeffrey Stratford
Gavin,

That worked!  I went through and I found a few missing cases where I had
. instead of NA - I'm still in SAS mode. 

Many thanks!



Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

 Gavin Simpson [EMAIL PROTECTED] 10/13/06 11:23 AM 
On Fri, 2006-10-13 at 09:28 -0500, Jeffrey Stratford wrote:
 Hi.  I'm attempting to fit a logistic/binomial model so I can
determine
 the influence of landscape on the probability that a box gets used by
a
 bird.  I've looked at a few sources (MASS text, Dalgaard, Fox and
 google) and the examples are almost always based on tabular predictor
 variables.  My data, however are not.  I'm not sure if that is the
 source of the problems or not because the one example that includes a
 continuous predictor looks to be coded exactly the same way.  Looking
at
 the output, I get estimates for each case when I should get a single
 estimate for purbank.  Any suggestions?
 
 Many thanks,
 
 Jeff

Hi Jeff,

using the snippet of data you provided (copy/paste into a text file and
read in with read.table) worked fine:

box.use - read.table(~/tmp/tmp.txt, header = TRUE)
box.use
str(box.use)
'data.frame':   8 obs. of  10 variables:
 $ box: int  1 2 3 4 5 6 7 8
 $ use: int  1 1 1 1 0 1 1 0
 $ purbank: num  0.00381 0.04429 0.04459 0.06072 0.60810 ...
 $ purban2: num  0.0268 0.1611 0.0604 0.2081 0.6980 ...
 $ purban1: num  0.069 0.172 0.000 0.069 0.690 ...
 $ pgrassk: num  0.3282 0.1534 0.1628 0.0194 0.0317 ...
 $ pgrass2: num  0.685 0.383 0.557 0.000 0.128 ...
 $ pgrass1: num  0.759 0.655 0.759 0.000 0.241 ...
 $ grassdist  : num0   0   0 323  30 ...
 $ grasspatchk: num  3.730 1.023 0.961 0.228 0.263 ...

Now I don't like attach, and you just don't need it so I deviate a
little now. Replace box.use$use directly and make use of the data
argument in glm. Also, your data didn't have any missing data so I'm not
sure whether the response or predictor is missing and whether your
na.omit is needed or not - I don't use it below.

box.use$use - factor(box.use$use, levels=0:1)
levels(box.use$use) - c(unused, used)
box.use
str(box.use)
glm1 - glm(use ~ purbank, data = box.use, family = binomial())

summary(glm1)

Call:
glm(formula = use ~ purbank, family = binomial(), data = box.use)

Deviance Residuals:
 Min1QMedian3Q   Max
-1.61450  -0.03098   0.31935   0.45888   1.39194

Coefficients:
Estimate Std. Error z value Pr(|z|)
(Intercept)3.223  2.225   1.4480.147
purbank   -6.129  4.773  -1.2840.199

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 8.9974  on 7  degrees of freedom
Residual deviance: 6.5741  on 6  degrees of freedom
AIC: 10.574

Number of Fisher Scoring iterations: 5


I suspect something got messed up in your reading of the data and R
thought purbank was a factor or character. Always check your data after
reading in, and str() is a your friend here as printed representations
are not always what they seem.

HTH

G

 
 
 THE DATA: (200 boxes total, used [0 if unoccupied, 1 occupied], the
rest
 are landscape variables).  
 

box use purbank purban2 purban1 pgrassk pgrass2 pgrass1 grassdist   
grasspatchk

1   1   0.003813435 0.02684564  0.06896552  0.3282487   
0.6845638   0.7586207   0   3.73

2   1   0.04429451  0.1610738   0.1724138   0.1534174   
0.3825503   0.6551724   0   1.023261

3   1   0.04458785  0.06040268  0   0.1628043   
0.5570470.7586207   0   0.9605769

4   1   0.06072162  0.2080537   0.06896552  0.01936052  
0   0   323.10990.2284615

5   0   0.6080962   0.6979866   0.6896552   0.03168084  
0.1275168   0.2413793   30  0.2627027

6   1   0.6060428   0.6107383   0.3448276   0.04077442  
0.2885906   0.4482759   30  0.2978571

7   1   0.3807568   0.4362416   0.6896552   0.06864183  
0.03355705  0   94.868330.468

8   0   0.3649164   0.3154362   0.4137931   0.06277501  
0.1275168   0   120 0.4585714
 
 THE CODE:
 
 box.use- read.csv(c:\\eabl\\2004\\use_logistic2.csv, header=TRUE)
 attach(box.use)
 box.use - na.omit(box.use)
 use - factor(use, levels=0:1)
 levels(use) - c(unused, used)
 glm1 - glm(use ~ purbank, binomial)
 
 THE OUTPUT:
 
 Coefficients:
  Estimate Std. Error   z value Pr(|z|)
 (Intercept)-4.544e-16  1.414e+00 -3.21e-161.000
 purbank02.157e+01  2.923e+04 0.0010.999
 purbank0.001173365  2.157e+01  2.067e+04 0.0010.999
 purbank0.001466706

[R] glm with nesting

2006-10-05 Thread Jeffrey Stratford
I just had a manuscript returned with the biggest problem being the
analysis.  Instead of using principal components in a regression I've
been asked to analyze a few variables separately. So that's what I'm
doing.

I pulled a feather from young birds and we quantified certain aspects of
the color of those feathers.  Since I often have more than one sample
from a nest, I thought I should use a nested design. 

Here's the code I've been using and I'd appreciate if someone could look
it over and see if it was correct. 

bb.glm1 - glm(rtot ~ box/(julian +purbank), data=bbmale,
family=gaussian, na.action=na.omit)

where rtot = total reflectance, box = nest box (i.e., birdhouse), 
julian = day of the year and purbank = the proportion of urban cover in
a 1 km buffer around the nest box.  I'm not interested in the box effect
and I've seperated males and female chicks.   

 I've asked about nestedness before and I was given code that included
| to indicate nestedness but this indicates a grouping does it not?  I
suspect that there is something wrong.  In the summary I get 

Coefficients:
  Estimate Std. Error t value Pr(|t|)
(Intercept)  2.880e-01  3.224e-03  89.322   2e-16 ***
box -3.219e-05  6.792e-05  -0.4740.636
box:julian   7.093e-08  3.971e-07   0.1790.859
box:purbank -1.735e-05  1.502e-04  -0.1150.908   

The other question I have is how do I test a null hypothesis - no
explanatory variables?  [rtot ~ NULL?]

Many thanks,

Jeff 




Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glm with nesting

2006-10-05 Thread Jeffrey Stratford
Peter and list,

Thanks for the response.  A did add box as a factor (box -
factor(box)).  Julian should be linear - bluebird chicks are bluer as
the season progresses from March to August.  

I did try the following

rtot.lme - lmer(rtot ~ sex +(purban|box:chick) + (purban|box), data=bb,
na.action=na.omit) # from H. Doran

and

rtot.lme2 - lme(fixed=rtot ~ sex + purban + sexv:purban, data = bb,
random = ~1 |box) # from K. Jones [EMAIL PROTECTED]

but these did not work (months ago and I don't remember exactly why) and
I have since seperated males and females and added day of the year
(julian).  But | does indicate grouping not nested, correct?

Could someone suggest some coding that might work?

Thanks again,

Jeff


 Peter Dalgaard [EMAIL PROTECTED] 10/05/06 7:14 AM 
Jeffrey Stratford [EMAIL PROTECTED] writes:

 I just had a manuscript returned with the biggest problem being the
 analysis.  Instead of using principal components in a regression I've
 been asked to analyze a few variables separately. So that's what I'm
 doing.
 
 I pulled a feather from young birds and we quantified certain aspects of
 the color of those feathers.  

 Since I often have more than one sample
 from a nest, I thought I should use a nested design. 

Notwithstanding comments below, that quote could be aiming for the
fortunes package... 

 
 Here's the code I've been using and I'd appreciate if someone could look
 it over and see if it was correct. 
 
 bb.glm1 - glm(rtot ~ box/(julian +purbank), data=bbmale,
 family=gaussian, na.action=na.omit)
 
 where rtot = total reflectance, box = nest box (i.e., birdhouse), 
 julian = day of the year and purbank = the proportion of urban cover in
 a 1 km buffer around the nest box.  I'm not interested in the box effect
 and I've seperated males and female chicks.   
 
  I've asked about nestedness before and I was given code that included
 | to indicate nestedness but this indicates a grouping does it not?  I
 suspect that there is something wrong.  In the summary I get 
 
 Coefficients:
   Estimate Std. Error t value Pr(|t|)
 (Intercept)  2.880e-01  3.224e-03  89.322   2e-16 ***
 box -3.219e-05  6.792e-05  -0.4740.636
 box:julian   7.093e-08  3.971e-07   0.1790.859
 box:purbank -1.735e-05  1.502e-04  -0.1150.908   

Several things look wrong here. 

Most importantly, you appear to have single-degree of freedom effects
(t tests) of things that appear not to be linear effects: Certainly,
you have more than two nest boxes, but also day of year as a linear
term looks suspicious to me. Unless there is something I have missed
completely, box should be a factor variable, and you might also need
trigonometric terms for the julian effect (depending on what sort of
time spans we are talking about.)

Secondly, notation like box/julian suggests that julian only makes
sense within a nest box i.e. 1st of March in one box is completely
different from 1st of March in another box (the notation is more
commonly used to describe bird number within nests and the like). And
with purbank presumably constant for measurements from the same box,
the box:purbank term looks strange indeed.

If you want to take account of a between-box variation in the effect
of covariates, you probably need to add them as variance components,
but this requires non-glm software, either lme() or lmer(). However,
instructing you on those is outside the scope of this mailing list,
and you may need to find a local consultant.

 The other question I have is how do I test a null hypothesis - no
 explanatory variables?  [rtot ~ NULL?]
 
 Many thanks,
 
 Jeff 
 
 
 
 
 Jeffrey A. Stratford, Ph.D.
 Postdoctoral Associate
 331 Funchess Hall
 Department of Biological Sciences
 Auburn University
 Auburn, AL 36849
 334-329-9198
 FAX 334-844-9234
 http://www.auburn.edu/~stratja
 
mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
   O__   Peter Dalgaard Ă˜ster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glm with nesting

2006-10-05 Thread Jeffrey Stratford
Harold and list,

I've changed a few things since the last time so I'm really starting
from scratch.  

I start with

bbmale - read.csv(c:\\eabl\\2004\\feathers\\male_feathers2.csv,
header=TRUE)
box -factor(box)
chick - factor(chick)

Here's a sample of the data

box,chick,julian,cltchsz,mrtot,cuv,cblue,purbank,purban2,purban1,pgrassk,pgrass2,pgrass1,grassdist,grasspatchk
1,2,141,2,21.72290152,0.305723811,0.327178868,0.003813435,0.02684564,0.06896552,0.3282487,0.6845638,0.7586207,0,3.73
4,1,164,4,18.87699007,0.281863299,0.310935559,0.06072162,0.2080537,0.06896552,0.01936052,0,0,323.1099,0.2284615
4,2,164,4,19.64359348,0.294117388,0.316049817,0.06072162,0.2080537,0.06896552,0.01936052,0,0,323.1099,0.2284615
7,1,118,4,13.48699876,0.303649408,0.31765218,0.3807568,0.4362416,0.6896552,0.06864183,0.03355705,0,94.86833,0.468
12,1,180,4,21.42196378,0.289731361,0.317562294,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032
12,2,180,4,18.79487905,0.286052077,0.316367349,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032
12,3,180,4,12.83127682,0.260197475,0.292636914,0.09238011,0.1342282,0,0.2430127,0.8322148,1,0,1.199032
15,1,138,4,20.07161467,0.287632782,0.318671887,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818
15,2,138,4,17.61146256,0.305581768,0.315848051,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818
15,3,138,4,20.36397134,0.271795667,0.30539683,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818
15,4,138,4,20.81940158,0.269468041,0.304160648,0.07046477,0.03355705,0.03448276,0.2755622,0.6577181,0.8275862,0,1.503818

As you can see I have multiple boxes ( 70).  Sometimes I have multiple
chicks per box each having their own response  to mrtot, cuv, and cblue
but the same landscape variables for that box.  Chick number is randomly
assigned and is not an effect I'm interested in.  I'm really not
interested in the box effect either.  I would like to know if landscape
affects the color of chicks (which may be tied into chick
health/physiology).  We also know that chicks get bluer as the season
progresses and that clutch size (cltchsz) has an effect so I'm including
that as covariates.  

Hopefully, this clears things up a bit. 

I do have the MASS and MEMS (Pineiro's) texts in hand. 

Many thanks,

Jeff

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 3-dimensional table

2006-02-05 Thread Jeffrey Stratford
Hi,

Last week my class conducted an experiment by putting out clay
caterpillars to look at the effects of urbanization, color, and location
on caterpillar predation.  There were two sites (urban, rural), three
colors (green, yellow, red) and two locations at each site (edge,
interior).  The entire data set is below.  I've checked out the MASS
book, Dalgaard's book, and the R-help archives and I haven't found
anything that suggests how to set up a spreadsheet for the xtab function
(say, xtab(predation ~ location + site + color, data=class).   It would
not be a problem to input the data by hand but I wouldn't know how to
set that up either.  Any suggestions would be greatly appreciated.  The
class is mostly college sophmores and juniors and biology and education
majors.  We are using R 2.2.1 on Windows XP.   

Again, many thanks,

Jeff

sitelocationcolor   predated
Urban   EdgeGreen   5
Urban   EdgeRed 30
Urban   EdgeYellow  11
Urban   InteriorGreen   11
Urban   InteriorRed 22
Urban   InteriorYellow  22
Rural   EdgeGreen   94
Rural   EdgeRed 40
Rural   EdgeYellow  67
Rural   InteriorGreen   40
Rural   InteriorRed 70
Rural   InteriorYellow  33



Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] nested ANCOVA: still confused

2006-01-25 Thread Jeffrey Stratford
Harold, Kingsford, and R-users,

I settled on using the lmer function.  I think the memory issue was more
a function of my poor coding than an actual memory problem.  I also
switched the label from box to clutch to avoid any potential
confusion with other functions. 

This coding seems to have worked:

 eabl - lmer(rtot ~ sexv + (purban2|clutch), maxIter=1000, data=bb,
na.action=na.omit)

However, I have two remaining questions: (1)how concerned should I be
with the warning message below and (2) is there a way to invoke output
to get an estimate of the effect of purban2 (the proportion of urban
cover 200 m around a box) on feather color (rtot) and if there is a
difference between the sexes?   I used the summary function and it
doesn't tell me much (see output below). 

I'll read up mixed models when Pinheiro arrives but any suggestions for
diagnostics?  I'm going to repeat this study and expand it by doubling
or tripling the number of birds.  

Warning message:
nlminb returned message false convergence (8) 
 in: LMEoptimize-(`*tmp*`, value = list(maxIter = 200, tolerance =
1.49011611938477e-08,  

 summary(eabl)
Linear mixed-effects model fit by REML
Formula: rtot ~ sexv + (purban2 | clutch) 
   Data: bb 
  AIC  BIClogLik MLdeviance REMLdeviance
 5164.284 6997.864 -2052.142   4128.792 4104.284
Random effects:
 Groups   Name   Variance  Std.Dev. Corr

  
 clutch   (Intercept)   502829   709.10 

  
  purban20 1341990  1158.44 -0.477  

  
  purban20.006711409   5683957  2384.10 -0.226  0.082   

  
  purban20.013422821772922  1331.51 -0.386  0.176  0.067  
.
.
.
.
.
# of obs: 235, groups: clutch, 74

Fixed effects:
Estimate Std. Error t value
(Intercept)  5950.01 241.59  24.628
sexvm1509.07 145.73  10.355

Correlation of Fixed Effects:
  (Intr)
sexvm -0.304

Thanks many time over,

Jeff


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

 Doran, Harold [EMAIL PROTECTED] 01/25/06 6:37 AM 
OK, we're getting somewhere. First, it looks as though (by the error
message) that you have a big dataset. My first recommendation is to use
lmer instead of lme, you will see a significant benefit in terms of
computional speed.

For the model this would be

lmer(rtot ~ sexv +(purban|box:chick) + (purban|box), bb,
na.action=na.omit)

Now, you have run out of memory. I don't know what operating system you
are using, so go and see the appropriate FAQ for increasing memory for
your OS. 

Second, I made a mistake in my reply. Your random statement should be
random=~purban|box/chick denoting that chicks are nested in boxes, not
boxes nested in chicks, sorry about that.

Now, why is it that each chick within box has the same value for purban?
If this is so, why are you fitting that as a random effect? It seems not
to vary across individual chicks, right? It seems there is only an
effect of box and not an effect for chicks. Why not just fit a random
effect only for box such as:

rtot.lme - lme(fixed=rtot~sexv, random=~purban2|box,
na.action=na.omit,bb)

or in lmer
lmer(rtot ~ sexv + (purban|box), bb, na.action=na.omit)

Harold
 


-Original Message-
From:   Jeffrey Stratford [mailto:[EMAIL PROTECTED]
Sent:   Tue 1/24/2006 8:57 PM
To: Doran, Harold; r-help@stat.math.ethz.ch
Cc: 
Subject:RE: [R] nested ANCOVA: still confused

R-users and Harold.

First, thanks for the advice;  I'm almost there.  

The code I'm using now is 

library(nlme)
bb - read.csv(E:\\eabl_feather04.csv, header=TRUE)
bb$sexv - factor(bb$sexv)
rtot.lme - lme(fixed=rtot~sexv, random=~purban2|chick/box,
na.action=na.omit, data=bb)

A sample of the data looks like this 

box chick   rtotpurban2 sexv
1   1   6333.51 0.026846f
1   2   8710.8840.026846m
2   1   5810.0070.161074f
2   2   5524.33 0.161074f
2   3   4824.4740.161074f
2   4   5617.6410.161074f
2   5   6761.7240.161074f
4   1   7569.6730.208054m
4   2   7877.0810.208054m
4   4   7455.55 0.208054f
7   1   5408.2870.436242m
10  1   6991.7270.14094 f
12  1   8590.2070.134228f
12  2   7536.747

[R] nested ANCOVA: still confused

2006-01-24 Thread Jeffrey Stratford
Dear R-users,

I did some more research and I'm still not sure how to set up an ANCOVA
with nestedness.  Specifically I'm not sure how to express chicks nested
within boxes.  I will be getting Pinheiro  Bates (Mixed Effects Models
in S and S-Plus) but it will not arrive for another two weeks from our
interlibrary loan.

The goal is to determine if there are urbanization (purban) effects on
chick health (rtot) and if there are differences between sexes (sex) and
the effect of being in the same clutch (box).

The model is rtot = sex + purban + (chick)box.

I've loaded the package lme4.  And the code I have so far is

bb - read.csv(C:\\eabl\\eabl_feather04.csv, header=TRUE)
bb$sex - factor(bb$sex)
rtot.lme - lme(bb$rtot~bb$sex, bb$purban|bb$chick/bb$box,
na.action=na.omit)

but this is not working.

Any suggestions would be greatly appreciated.

Thanks,

Jeff









Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] nested ANCOVA: still confused

2006-01-24 Thread Jeffrey Stratford
R-users and Harold.

First, thanks for the advice;  I'm almost there.  

The code I'm using now is 

library(nlme)
bb - read.csv(E:\\eabl_feather04.csv, header=TRUE)
bb$sexv - factor(bb$sexv)
rtot.lme - lme(fixed=rtot~sexv, random=~purban2|chick/box,
na.action=na.omit, data=bb)

A sample of the data looks like this 

box chick   rtotpurban2 sexv
1   1   6333.51 0.026846f
1   2   8710.8840.026846m
2   1   5810.0070.161074f
2   2   5524.33 0.161074f
2   3   4824.4740.161074f
2   4   5617.6410.161074f
2   5   6761.7240.161074f
4   1   7569.6730.208054m
4   2   7877.0810.208054m
4   4   7455.55 0.208054f
7   1   5408.2870.436242m
10  1   6991.7270.14094 f
12  1   8590.2070.134228f
12  2   7536.7470.134228m
12  3   5145.3420.134228m
12  4   6853.6280.134228f
15  1   8048.7170.033557m
15  2   7062.1960.033557m
15  3   8165.9530.033557m
15  4   8348.58 0.033557m
16  2   6534.7750.751678m
16  3   7468.8270.751678m
16  4   5907.3380.751678f
21  1   7761.9830.221477m
21  2   6634.1150.221477m
21  3   6982.9230.221477m
21  4   7464.0750.221477m
22  1   6756.7330.281879f
23  2   8231.4960.134228m

The error I'm getting is

Error in logLik.lmeStructInt(lmeSt, lmePars) : 
Calloc could not allocate (590465568 of 8) memory
In addition: Warning messages:
1: Fewer observations than random effects in all level 2 groups in:
lme.formula(fixed = rtot ~ sexv, random = ~purban2 | chick/box,  
2: Reached total allocation of 382Mb: see help(memory.size) 

There's nothing special about chick 1, 2, etc.  These were simply the
order of the birds measured in each box so chick 1 in box 1 has nothing
to do with chick 1 in box 2.

Many thanks,

Jeff 


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

 Doran, Harold [EMAIL PROTECTED] 01/24/06 2:04 PM 
Dear Jeff:

I see the issues in your code and have provided what I think will solve
your problem. It is often much easier to get help on this list when you
provide a small bit of data that can be replicated and you state what
the error messages are that you are receiving. OK, with that said, here
is what I see. First, you do not need to use the syntax bb$sex in your
model, this can be significantly simplified. Second, you do not have a
random statement in your model.

Here is your original model:
lme(bb$rtot~bb$sex, bb$purban|bb$chick/bb$box, na.action=na.omit)

Here is what it should be:

lme(fixed = rtot~sex, random=~purban|chick/box, na.action=na.omit,
data=bb)

Notice there is a fixed and random call. You can simplify this as

lme(rtot~sex, random=~purban|chick/box, na.action=na.omit, bb)

Note, you can eliminate the fixed= portion but not the random
statement.

Last, if you want to do this in lmer, the newer function for mixed
models in the Matrix package, you would do

lmer(rtot~sex + (purban|box:chick) + (purban|box), na.action=na.omit,
data=bb)

Hope this helps.
Harold




-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jeffrey Stratford
Sent: Tuesday, January 24, 2006 11:34 AM
To: r-help@stat.math.ethz.ch
Subject: [R] nested ANCOVA: still confused

Dear R-users,

I did some more research and I'm still not sure how to set up an ANCOVA
with nestedness.  Specifically I'm not sure how to express chicks nested
within boxes.  I will be getting Pinheiro  Bates (Mixed Effects Models
in S and S-Plus) but it will not arrive for another two weeks from our
interlibrary loan.

The goal is to determine if there are urbanization (purban) effects on
chick health (rtot) and if there are differences between sexes (sex) and
the effect of being in the same clutch (box).

The model is rtot = sex + purban + (chick)box.

I've loaded the package lme4.  And the code I have so far is

bb - read.csv(C:\\eabl\\eabl_feather04.csv, header=TRUE) bb$sex -
factor(bb$sex) rtot.lme - lme(bb$rtot~bb$sex,
bb$purban|bb$chick/bb$box,
na.action=na.omit)

but this is not working.

Any suggestions would be greatly appreciated.

Thanks,

Jeff









Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn

[R] regression with nestedness

2006-01-22 Thread Jeffrey Stratford
Dear R-users,

I set up an experiment where I put up bluebird boxes across an
urbanization gradient.  I monitored these boxes and at some point I
pulled a feather from a chick and a friend used spectral properties
(rtot, a continuous var) to index chick health.  There is an effect of
sex that I would like to include but how would I set up a regression and
look at the effect of urbanization (purban, a continuous var)) on
feather properties of chicks within boxes.  

So the model should look something like rtot = sex + purban +
(chick)clutch

Also, when I plot purban against rtot using the plot function I get
boxplots but I would like to ignore the clutch and just plot each point.
 I've tried type = p but this has no effect.  

Thanks,

Jeff

 


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] cv.glm help: no responses?

2005-11-22 Thread Jeffrey Stratford
I sent out a question a few days ago asking help with cv.glm and I
didn't get any responses.  Is it my question?  I'd appreciate any
response just to see my post makes it out there. 

What I'm looking to do is to get a column of predicted responses from
cv.glm (leave-one-out).  This produces a single predicted value for each
case, correct?  How can I get those values?  I don't want predicted
values using a full model but just the values that come from the cross
validation.  

Thanks,

Jeff

Here's the code I used for the model and to get delta: 

q_uk.nb - glm.nb(nat_est ~ UK + I(UK^2), SRCOUNT, link=log)
cv.err1 - cv.glm(SRCOUNT,q_uk.nb)



Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] predicted values from cv.glm

2005-11-21 Thread Jeffrey Stratford
Hi folks,

Is there a way to get the predicted values from leave-one-out cross
validation using cv.glm?   More generally, is there a way to see what
output is available with any function that may not show up using the
help() function?

Below is the code that I've been using:

SRCOUNT - read.table(file.choose(),header=T)  
library(boot)
library(MASS)
q_u2 - glm.nb(res_est ~ U2 + I(U2^2), SRCOUNT, link = log)
cv.qu2- cv.glm(SRCOUNT,qu2)

Many thanks,

Jeff 


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] predicted values from cv.glm

2005-11-19 Thread Jeffrey Stratford
Hi. Is there a way to get the values predicted from (leave-one-out)
cv.glm?  

It seems like a useful diagnostic to plot observed vs. predicted values.

Thanks,

Jeff


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] cv.glm help

2005-11-17 Thread Jeffrey Stratford
I would appreciate some help writing R code to plot predicted species
richness vs. observed species richness (nat_est) with 95% CI lines.  

I'm using glm to get model coefficients and remove-one cross validation
to get predicted (cv.glm). 

Here is what I have for the code so far:

SRCOUNT - read.table(file.choose(),header=T) 
library(boot)
library(MASS)
quk.native - glm.nb(nat_est ~ UK + I(UK^2), SRCOUNT, link=log)
cv.quk  - cv.glm(SRCOUNT, quk)

Many thanks,

Jeff


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] glm x^2

2005-11-11 Thread Jeffrey Stratford
R-users,

I'm having some trouble getting .glm and glm.nb to run a polynomial. 
I've used x*x and x^2 and neither works.  I've checked out the archives
and they refer to an archive that's no longer working.  

I've seen that they use poly() but I'm following up my analysis with
cv.glm so I'd prefer to keep using glm.  It's easier to just add a
column to my data but I'd rather code it.

Thanks for the response... I appreciate the people that work on the
list.

Jeff


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Poisson/negbin followed by jackknife

2005-11-07 Thread Jeffrey Stratford
Folks,

Thanks for the help with the hier.part analysis.  All the problems
stemmed from an import problem which was solved with file.chose().

Now that I have the variables that I'd like to use I need to run some
GLM models.  I think I have that part under control but I'd like to use
a jackknife approach to model validation (I was using a hold out sample
but this seems to have fallen out of favor).  

I'd appreciate it if someone could just point me in the right direction
for the jackkife analysis given a particular distribution, coefficients,
etc.

Thanks,

Jeff


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] index question

2005-11-01 Thread Jeffrey Stratford
Thanks for those on the list that answered my previous question.  I'm
just about where I need to be (looking at output).  

In the hier.part documentation there is a line env - urbanwq[,2:8].  

This means use rows 2 through 8 in the data frame urbanwq, right? 
What does the comma represent?  If one wasn't using column headers would
this be necessary?

Thanks,

Jeff




Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] help with hier.part

2005-11-01 Thread Jeffrey Stratford
R-users,

Attached is the file  (SR_use2.txt) I'd like to include and includes
column headers.  nat_est is the response variable and is the number of
species at a particular point.  The other variables are the explanatory
vars (vark, var2, var1, UK, U2, U1, GK, G2, G1, PK, P2, P1).  

Here is Walsh's sample code for hier.part:

data(urbanwq)
env - urbanwq[,2,8]
hier.part(urbanwq$lec, env, fam=gaussian, gof=Rssqu)

The code I wrote is 

library(hier.part)
SRUSE- read.table(F:\\GEORGIA\\species_richness\\SR_use2.txt, sep=
, header = TRUE, row.names = 1)
TEMP- SRUSE[2:13]
hier.part(SRUSE$nat_est,TEMP, family=NegBin, gof=logLik, barplot=
TRUE)

So far this doesn't work and I'd really appreciate some help.

While I have your ears, what books would one make for the clueless?  

Many thanks,

Jeff

PS. nat_est is the estimated number of species (species richness). 
Around each of the sampling points I calculated the % of different types
of cover (pine, hardwoods, number of different covers) in three scales
around the sampling points (1000, 200, and 100 m).  What I'm hoping to
do with the analysis is to find the best scales and parameters that best
predicts species richness. 
.  


Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] Import help (neophyte)

2005-10-31 Thread Jeffrey Stratford
Hi, I have no experience with R and I'm finding the manuals a bit obtuse
and written as if I already understood R.  

I'm trying to import a csv file from a floppy and it's not working. The
code I'm using is 

read.table(F:\GEORGIA\species_richness\SR_use.csv, sep=,, header =
TRUE, row.names = 1) 

I'm assuming that this command is case sensitive so everything matches. 
 Is there something I'm missing?  

Thanks,

Jeff 



Jeffrey A. Stratford, Ph.D.
Postdoctoral Associate
331 Funchess Hall
Department of Biological Sciences
Auburn University
Auburn, AL 36849
334-329-9198
FAX 334-844-9234
http://www.auburn.edu/~stratja

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html