date:20090519

Re: [R] Finding data association in R

2009-05-19 Thread phen_ys




Johannes Hüsing wrote:
 
 
 On 19.05.2009 at 05:39, phen_ys wrote:
 

 surgery <- data.frame(outcome = c(0, 0, 0, 0, 0, 0, 0, 0, 0,
   0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0,
   0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0), age = c(50, 50, 51,
   51, 53, 54, 54, 54, 55, 55, 56, 56, 56, 57, 57, 57, 57, 58,
   59, 60, 61, 61, 61, 62, 62, 62, 62, 63, 63, 63, 64, 64, 65,
   67, 67, 68, 68, 69, 70, 71))

 How can I use R to find an association between the death rate and age
 in the above data?

 
 with(surgery, boxplot(age ~ outcome))
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

I tried to fit this with a model using the lm function, but the result
doesn't make much sense. The question I'm trying to answer is whether the
death rate is associated with age, e.g. whether the death rate is higher at
older ages.

-- 
View this message in context: 
http://www.nabble.com/Finding-data-association-in-R-tp23609249p23610952.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Finding data association in R

2009-05-19 Thread Bill.Venables

Your problem is statistical and has nothing particularly to do with R.

It looks like homework to me.  

You may care to look at it this way:

###

 fm <- glm(outcome ~ age, binomial, surgery)
 summary(fm)

Call:
glm(formula = outcome ~ age, family = binomial, data = surgery)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.6601  -0.8099  -0.5839   1.0491   1.7079  

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -10.48174    4.30409  -2.435   0.0149
age           0.16295    0.07018   2.322   0.0202

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 51.796  on 39  degrees of freedom
Residual deviance: 45.301  on 38  degrees of freedom
AIC: 49.301

Number of Fisher Scoring iterations: 3

###
 
There are still a few questions left, of course.
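
One way to read the fitted coefficients, offered as a minimal sketch rather than a full analysis: on the logit scale the age coefficient is a log odds ratio, so with the estimate reported above (0.16295) each extra year of age multiplies the odds of death by roughly exp(0.16295), about 1.18. The names odds_ratio, new_ages and p_hat below are illustrative and do not come from the thread.

```r
# Data as posted at the start of the thread
surgery <- data.frame(
  outcome = c(0, 0, 0, 0, 0, 0, 0, 0, 0,
              0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0,
              0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0),
  age = c(50, 50, 51,
          51, 53, 54, 54, 54, 55, 55, 56, 56, 56, 57, 57, 57, 57, 58,
          59, 60, 61, 61, 61, 62, 62, 62, 62, 63, 63, 63, 64, 64, 65,
          67, 67, 68, 68, 69, 70, 71))

# Logistic regression of death (1) versus survival (0) on age
fm <- glm(outcome ~ age, family = binomial, data = surgery)

# Multiplicative change in the odds of death per additional year of age
odds_ratio <- exp(coef(fm)[["age"]])

# Fitted death probabilities at a few illustrative ages
new_ages <- data.frame(age = c(50, 60, 70))
p_hat <- predict(fm, newdata = new_ages, type = "response")

print(odds_ratio)
print(cbind(new_ages, p_hat))
```

A linear model fitted with lm treats the 0/1 outcome as a continuous response and can return fitted values outside [0, 1], which is one reason the binomial GLM is the more natural choice here.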



Bill Venables
http://www.cmis.csiro.au/bill.venables/ 


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of phen_ys
Sent: Tuesday, 19 May 2009 5:17 PM
To: r-help@r-project.org
Subject: Re: [R] Finding data association in R





Re: [R] Predicting complicated GAMMs on response scale

2009-05-19 Thread Gavin Simpson

On Mon, 2009-05-18 at 11:48 -0700, William Paterson wrote:
 Hi,
 
 I am using GAMMs to show a relationship of temperature differential over
 time with a model that looks like this:-
 
 gamm(Diff~s(DaysPT)+AirToC,method="REML") 
 
 where DaysPT is time in days since injury and Diff is repeat measures of
 temperature differentials with regards to injury sites compared to
 non-injured sites in individuals over the course of 0-24 days. I use the
 following code to plot this model on the response scale with 95% CIs which
 works fine:-
 
 g.m<-gamm(Diff~s(DaysPT)+AirToC,method="REML")
 p.d<-data.frame(DaysPT=seq(min(DaysPT),max(DaysPT)))
 p.d$AirToC<-(6.7)
 b<-predict.gam(g.m$gam,p.d,se=TRUE)
 range<-c(min(b$fit-2*b$se.fit),max(b$fit+2*b$se.fit))
 plot(p.d$DaysPT,b$fit,ylim=c(-4,12),xlab="Days post-tagging",
      ylab="dTmax (ºC)",type="l",lab=c(24,4,12),las=1,cex.lab=1.5,
      cex.axis=1,lwd=2)
 lines(p.d$DaysPT,b$fit+b$se.fit*1.96,lty=2,lwd=1.5)
 lines(p.d$DaysPT,b$fit-b$se.fit*1.96,lty=2,lwd=1.5)
 points(DaysPT,Diff)
 
 
 However, when I add a correlation structure and/or a variance structure so
 that the model may look like:- 
 
 
 gamm(Diff~s(DaysPT3)+AirToC,correlation=corCAR1(form=~DaysPT|Animal),
      weights=varPower(form=~DaysPT),method="REML")
 
 
 I get this message at the point of inputting the line
 b<-predict.gam(g.m$gam,p.d,se=TRUE)

Note that p.d doesn't contain Animal. Not sure this is the problem, but
I would have thought you'd need to supply new values of Animal for the
data you wish to predict for in order to get the CAR(1) errors correct.
Is it possible that the model is finding another Animal variable in the
global environment?

I have predicted from several thousand GAMMs containing correlation
structures similar to the way you do above so this does work in general.
If the above change to p.d doesn't work, you'll probably need to speak
to Simon Wood to take this further.

Is mgcv up-to-date? I am using 1.5-5 that was released in the last week
or so.

For example, this dummy example runs without error for me and is similar
to your model

y1 <- arima.sim(list(order = c(1,0,0), ar = 0.5), n = 200, sd = 1)
y2 <- arima.sim(list(order = c(1,0,0), ar = 0.8), n = 200, sd = 3)
x1 <- rnorm(200)
x2 <- rnorm(200)
ind <- rep(1:2, each = 200)
d <- data.frame(Y = c(y1,y2), X = c(x1,x2), ind = ind,
                time = rep(1:200, times = 2))
require(mgcv)
mod <- gamm(Y ~ s(X), data = d, corr = corCAR1(form = ~ time | ind),
            weights = varPower(form = ~ time))
p.d <- data.frame(X = rep(seq(min(d$X), max(d$X), len = 20), 2),
                  ind = rep(1:2, each = 20),
                  time = rep(1:20, times = 2))
pred <- predict(mod$gam, newdata = p.d, se = TRUE)

Does this work for you? If not, the above would be a reproducible
example (as asked for in the posting guide) and might help Simon track
down the problem if you are running an up-to-date mgcv.
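
A quick way to check the "not all required variables have been supplied in newdata" warning quoted in this thread is to compare the variables in the model formula with the columns of the prediction data frame. The sketch below uses a hypothetical helper, missing_vars, and only looks at the formula itself, not at variables used in the correlation or variance structures. The formula is copied from the second model in the question, which smooths DaysPT3 while p.d holds DaysPT; whether that is a typo in the post is not known.

```r
# Hypothetical helper: names used on the right-hand side of a two-sided
# formula that are not columns of newdata
missing_vars <- function(formula, newdata) {
  rhs_vars <- all.vars(formula[[3]])
  setdiff(rhs_vars, names(newdata))
}

# Formula as written in the question (fixed-effects part only)
f <- Diff ~ s(DaysPT3) + AirToC

# Prediction grid in the style of the question: days 0 to 24, fixed AirToC
p.d <- data.frame(DaysPT = 0:24)
p.d$AirToC <- 6.7

# Any name returned here would be looked up outside newdata by predict()
print(missing_vars(f, p.d))
```

If a name is returned, predict may pick up a variable of that name from the workspace instead, which would be consistent with the "variable lengths differ" error, although the thread does not confirm the cause.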

HTH

G

 
 
 Error in model.frame(formula, rownames, variables, varnames, extras,
 extranames,  : 
 variable lengths differ (found for 'DaysPT')
 In addition: Warning messages:
 1: not all required variables have been supplied in  newdata!
  in: predict.gam(g.m$gam, p.d, se = TRUE) 
 2: 'newdata' had 25 rows but variable(s) found have 248 rows 
 
 
 Is it possible to predict a more complicated model like this on the response
 scale? How can I incorporate a correlation structure and variance structure
 in a dataframe when using the predict function for GAMMs?
 
 Any help would be greatly appreciated.
 
 William Paterson
 
 
 
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%


Re: [R] Overdispersion using repeated measures lmer

2009-05-19 Thread ONKELINX, Thierry

Dear Christine,

The poisson family does not allow for overdispersion (nor
underdispersion). Try using the quasipoisson family instead.
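
As a rough illustration of how overdispersion can be gauged, the sketch below uses a plain GLM on simulated counts, not the mixed model from the question. The data, the variable names and the negative binomial generating model are all assumptions made for the example. The ratio of the Pearson chi-square statistic to the residual degrees of freedom should be near 1 for Poisson data, and the quasipoisson family estimates that same ratio as its dispersion parameter.

```r
set.seed(1)

# Simulated, deliberately overdispersed counts in two groups
n <- 200
treatment <- gl(2, n / 2)
mu <- ifelse(treatment == "1", 4, 6)
count <- rnbinom(n, size = 1, mu = mu)

# Poisson fit: the dispersion is fixed at 1 by assumption
fit_pois <- glm(count ~ treatment, family = poisson)

# Pearson chi-square divided by residual degrees of freedom
dispersion <- sum(residuals(fit_pois, type = "pearson")^2) /
  df.residual(fit_pois)

# Quasi-Poisson fit: same coefficients, dispersion estimated from the data
fit_quasi <- glm(count ~ treatment, family = quasipoisson)

print(dispersion)
print(summary(fit_quasi)$dispersion)
```

Values well above 1 suggest overdispersion; standard errors from the quasipoisson fit are inflated by the square root of the estimated dispersion.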

HTH,

Thierry

 




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium
tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Christine Griffiths
Sent: Monday 18 May 2009 13:26
To: r-help@r-project.org
Subject: [R] Overdispersion using repeated measures lmer

Dear All

I am trying to do a repeated measures analysis using lmer and have a
number of issues. I have non-orthogonal, unbalanced data.  Count data
was obtained over 10 months for three treatments, which were arranged
into 6 blocks. 
Treatment is not nested in Block but crossed, as I originally designed
an orthogonal, balanced experiment but subsequently lost a treatment
from 2 blocks. My fixed effects are Treatment and Month, and my random
effect is Block, which was repeatedly sampled.  My model is:

Model<-lmer(Count~Treatment*Month+(Month|Block),data=dataset,
            family=poisson(link="sqrt"))

Is this the only way in which I can specify my random effects? I.e. can
I specify them as: (1|Block)+(1|Month)?

When I run this model, I do not get any residuals in the error term or
estimated scale parameters and so do not know how to check if I have
overdispersion. Below is the output I obtained.

Generalized linear mixed model fit by the Laplace approximation
Formula: Count ~ Treatment * Month + (Month | Block)
   Data: dataset
   AIC   BIC logLik deviance
 310.9 338.5 -146.4    292.9
Random effects:
 Groups Name        Variance   Std.Dev. Corr
 Block  (Intercept) 0.06882396 0.262343
        Month       0.00011693 0.010813 1.000
Number of obs: 160, groups: Block, 6

Fixed effects:
                          Estimate Std. Error z value Pr(>|z|)
(Intercept)               1.624030   0.175827   9.237  < 2e-16 ***
Treatment2.Radiata        0.150957   0.207435   0.728 0.466777
Treatment3.Aldabra       -0.005458   0.207435  -0.026 0.979009
Month                    -0.079955   0.022903  -3.491 0.000481 ***
Treatment2.Radiata:Month  0.048868   0.033340   1.466 0.142717
Treatment3.Aldabra:Month  0.077697   0.033340   2.330 0.019781 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) Trt2.R Trt3.A Month  T2.R:M
Trtmnt2.Rdt -0.533
Trtmnt3.Ald -0.533  0.450
Month       -0.572  0.585  0.585
Trtmnt2.R:M  0.474 -0.882 -0.402 -0.661
Trtmnt3.A:M  0.474 -0.402 -0.882 -0.661  0.454


Any advice on how to account for overdispersion would be much
appreciated.

Many thanks in advance
Christine

--
Christine Griffiths
School of Biological Sciences
University of Bristol
Woodland Road
Bristol BS8 1UG
Tel: 0117 9287593
Fax 0117 925 7374
christine.griffi...@bristol.ac.uk
http://www.bio.bris.ac.uk/research/mammal/tortoises.html


[R] As mat.or.vec

2009-05-19 Thread Barbara . Rogo

Is there a command like mat.or.vec for an array that I have to fill in with
a for loop?
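
One possible reading of the question, sketched under the assumption that a zero-filled multi-dimensional array is wanted before a loop fills it: base R's array() plays the role that mat.or.vec plays for matrices and vectors. The dimensions and the fill rule below are made up for illustration.

```r
# Preallocate a 2 x 3 x 4 array of zeros (dimensions are illustrative)
dims <- c(2, 3, 4)
a <- array(0, dim = dims)

# Fill it element by element in nested for loops
for (i in seq_len(dims[1])) {
  for (j in seq_len(dims[2])) {
    for (k in seq_len(dims[3])) {
      a[i, j, k] <- i * j * k
    }
  }
}

print(dim(a))
print(a[2, 3, 4])
```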


Re: [R] Generic 'diff'

2009-05-19 Thread Wacek Kusnierczyk

Stavros Macrakis wrote:
 On Mon, May 18, 2009 at 6:00 PM, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
 

   
 I understood what you were asking but R is an oo language so
 that's the model to use to do this sort of thing.

 

 I am not talking about creating a new class with an analogue to the
 subtraction function.  I am talking about a function which applies another
 function to a sequence and its lagged version.

 Functional arguments are used all over the place in R's base package
 (Xapply, sweep, outer, by, not to mention Map,  Reduce, Filter, etc.) and
 they seem perfectly natural here.
   

perhaps 'diff' would not be the best name, something like 'lag' would be
better for the more generic function, but 'lag' is already taken.

i agree it would be reasonable to have diff (lag) to accept an extra
argument for the function to be applied.  the solution of wrapping the
vector into a new class to be diff'ed with a non-default diff does not
seem to make much sense, as (a) what you seem to want is to custom-diff
plain vectors, (b) to keep the diff family coherent, you'd need to
upgrade the other diffs to have the extra argument anyway.

as you say, it's trivial to implement an extended diff, say difff,
reusing code from diff:

difff = function(x, ...)
   UseMethod('difff')
difff.default = function(x, lag=1, differences=1, fun=`-`, ...) {
   ismat = is.matrix(x)
   xlen = if (ismat) dim(x)[1L] else length(x)
   if (length(lag) > 1L || length(differences) > 1L || lag < 1L ||
       differences < 1L)
      stop("'lag' and 'differences' must be integers >= 1")
   if (lag * differences >= xlen) return(x[0])
   r = unclass(x)
   i1 = -1L:-lag
   if (ismat)
      for (i in 1L:differences)
         r = fun(r[i1, , drop = FALSE],
                 r[-nrow(r):-(nrow(r) - lag + 1), , drop = FALSE])
   else
      for (i in 1L:differences)
         r = fun(r[i1], r[-length(r):-(length(r) - lag + 1)])
   class(r) = oldClass(x)
   r }

now, this naive version seems to work close to what you'd like:

difff(1:4)
# 1 1 1

difff(1:4, fun=`+`)
# 3 5 7

it might be useful if the original diff were working this way.
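
For plain vectors and a single pass, the same idea can be written more compactly. The following is a simplified sketch, not a replacement for difff: the name lagged_apply is hypothetical, and matrices and the differences argument are deliberately left out.

```r
# Hypothetical helper: apply fun to each element and the element lag
# positions before it (vectors only, one pass)
lagged_apply <- function(x, lag = 1L, fun = `-`) {
  n <- length(x)
  if (lag >= n) return(x[0L])
  fun(x[(lag + 1L):n], x[seq_len(n - lag)])
}

print(lagged_apply(1:4))                        # default behaves like diff(1:4)
print(lagged_apply(c(1, 2, 4, 8), fun = `/`))   # successive ratios
print(lagged_apply(1:10, lag = 3L))             # like diff(1:10, lag = 3)
```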

vQ


Re: [R] fitting distribution

2009-05-19 Thread Stefan Grosse

On Tue, 19 May 2009 14:04:19 +1000 Kon Knafelman <konk2...@hotmail.com>
wrote:

KK i have the sample variances for 1000 samples, and i want to fit it
KK to a chi-squared distribution.

KK can someone please help me fit this to a chi-squared distribution
KK with n degrees of freedom. Thanks a lot 

Dear Kon, 

1. Please mail only to r-h...@stat.math.ethz.ch OR to
r-help@r-project.org, as we receive each of your mails twice if
you mail to both. 

2. Please read the posting guide. A link is attached to
every e-mail:

PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html and provide commented,
minimal, self-contained, reproducible code.

especially the section "Do your homework". It means: use the search on
r-project.org and rseek.org and have a look at the documentation.

Don't expect others to do your homework. And show some effort!

Using the search would have pointed you to, for example, a guide by Ricci
on fitting distributions:
http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf
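
As a hedged sketch of what such a fit can look like, assuming the samples are normal with known variance 1 and a common sample size n (neither is stated in the question): the scaled sample variances (n - 1) * s^2 / sigma^2 should then follow a chi-squared distribution with n - 1 degrees of freedom, and the degrees of freedom can be estimated by maximum likelihood. All data below are simulated and the variable names are illustrative.

```r
set.seed(42)

n <- 10                                  # assumed size of each sample
s2 <- replicate(1000, var(rnorm(n)))     # 1000 simulated sample variances
scaled <- (n - 1) * s2                   # sigma^2 = 1 in this simulation

# Negative log-likelihood of a chi-squared distribution in its df
negloglik <- function(df) -sum(dchisq(scaled, df = df, log = TRUE))

# One-dimensional maximum likelihood estimate of the degrees of freedom
fit <- optimize(negloglik, interval = c(0.1, 50))
print(fit$minimum)

# Visual check: histogram of the scaled variances with the fitted density
hist(scaled, breaks = 40, freq = FALSE, main = "", xlab = "(n - 1) * s^2")
curve(dchisq(x, df = fit$minimum), add = TRUE, lwd = 2)
```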

If you have programmed something and it does not work as expected, THEN
mail to the list. 

hth
Stefan

PS You don't know Debbie Zhang by coincidence?...


Re: [R] Generic 'diff'

2009-05-19 Thread Wacek Kusnierczyk

Wacek Kusnierczyk wrote:
 Stavros Macrakis wrote:
   


[...]
 I am not talking about creating a new class with an analogue to the
 subtraction function.  I am talking about a function which applies another
 function to a sequence and its lagged version.

 Functional arguments are used all over the place in R's base package
 (Xapply, sweep, outer, by, not to mention Map,  Reduce, Filter, etc.) and
 they seem perfectly natural here.
   
 

   
[...]

 as you say, it's trivial to implement an extended diff, say difff,
 reusing code from diff:

 difff = function(x, ...)
    UseMethod('difff')
 difff.default = function(x, lag=1, differences=1, fun=`-`, ...) {
    ismat = is.matrix(x)
    xlen = if (ismat) dim(x)[1L] else length(x)
    if (length(lag) > 1L || length(differences) > 1L || lag < 1L ||
        differences < 1L)
       stop("'lag' and 'differences' must be integers >= 1")
   

btw., the error message here is confusing:

lag = 1:2
diff(1:10, lag=lag)
# Error in diff.default(1:10, lag = lag) :
#  'lag' and 'differences' must be integers >= 1

is.integer(lag)
# TRUE
all(lag >= 1)
# TRUE
  
what is meant is that lag and differences must be atomic 1-element
vectors of positive integers.  or rather integer-representing numerics:

lag = 1
diff(1:5, lag=1)
# fine
is.integer(lag)
# FALSE

(the usual confusion between 'integer' as the underlying representation
and 'integer' as the represented number.)
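
The distinction can be made explicit with a small check that tests for a single whole number at least 1, whatever its storage type. The helper name is_whole_scalar is hypothetical and this is only one possible way to phrase the condition, not what diff.default does internally.

```r
# Hypothetical validator: TRUE for a length-one numeric that represents
# a whole number >= 1, whether stored as integer or double
is_whole_scalar <- function(x) {
  is.numeric(x) && length(x) == 1L && !is.na(x) && x >= 1 && x == round(x)
}

print(is_whole_scalar(1))     # double storing a whole number
print(is_whole_scalar(2L))    # integer storage
print(is_whole_scalar(1:2))   # integer storage, but not a single value
print(is_whole_scalar(1.5))   # numeric, but not a whole number
```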

vQ


Re: [R] Overdispersion using repeated measures lmer

2009-05-19 Thread Christine Griffiths

Thanks. I did try using quasipoisson and a negative binomial error but am
unsure of the degree of overdispersion and whether it is simply due to
missing values. I am investigating to see if I can replace these missing
values so that I can have a balanced orthogonal design and use lme or aov
instead which is easier to interpret. Any ideas on whether it is feasible to
replace missing values for a small dataset with repeated measures? I have 6
blocks with 3 treatments sampled over 10 months. Two blocks are missing one
treatment, albeit a different one. Also any suggestions about how I would go
about this would be much appreciated.
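
To make the imbalance concrete, the layout described above can be tabulated. This is a sketch with made-up labels: the treatment names T1 to T3 and the choice of which block lost which treatment are assumptions, since the message does not say. A full 6 x 3 x 10 design has 180 observations; dropping one treatment from each of two blocks leaves 160, which agrees with the number of observations in the model output quoted in this thread.

```r
# Full crossed layout: 10 months x 3 treatments x 6 blocks
design <- expand.grid(Month = 1:10,
                      Treatment = c("T1", "T2", "T3"),
                      Block = 1:6)

# Assumed losses: block 1 lost T2 and block 2 lost T3 (illustrative only)
lost <- with(design, (Block == 1 & Treatment == "T2") |
                     (Block == 2 & Treatment == "T3"))
design <- design[!lost, ]

# Observations per Block x Treatment cell; zeros mark the missing cells
cells <- with(design, table(Block, Treatment))
print(cells)
print(nrow(design))
```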

I am also unsure of whether my random effects (Month|Block) for repeated
measures with random slope and intercept is correct and whether (1|Month) +
(1|Block) represents repeated measures. Any confirmation would be great.

Cheers
Christine

