Re: [R] lme: subject-specific slopes.

2012-12-04 Thread Brian S Cade
John:  I think you want the output from coef(fitRIRT).  ranef(fitRIRT) 
will give you the subject-specific random-effect deviations from the 
fixed-effects means.  coef(fitRIRT) will give you the combination of the 
fixed-effects means with the subject-specific random-effect deviations, 
and will thus be on the scale you are expecting.
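
A minimal sketch, reusing John's fitted object from the code below (the 
two-column arithmetic is illustrative for this intercept-plus-time model, 
not nlme's internal code):

library(nlme)
cf <- coef(fitRIRT)    # subject-specific intercepts and slopes, expected scale
fe <- fixef(fitRIRT)   # the fixed-effects means
re <- ranef(fitRIRT)   # the subject-specific deviations alone
head(cf)
head(re + matrix(fe, nrow(re), 2, byrow = TRUE))  # reconstructs coef() by hand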

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
John Sorkin jsor...@grecc.umaryland.edu
To:
r-help@r-project.org, Kenneth Frost kfr...@wisc.edu
Date:
12/04/2012 09:10 AM
Subject:
Re: [R] lme: subject-specific slopes.
Sent by:
r-help-boun...@r-project.org



Ken,
Thank you for your help. ranef(fitRIRT) does not give me what I expect. 
The subject-specific slopes and subject-specific intercepts are not 
anywhere close to what I would expect them to be; the mean of the 
subject-specific values should be close to those reported by 
summary(fitRIRT), and they are not. As you will see by examining the 
material below, the subject-specific slopes are off by many orders of 
magnitude. The intercepts are also far from the value reported in 
summary(fitRIRT). Do you have any thoughts?
Thanks,
John
 
> fitRIRT <- lme(echogen ~ time, random = ~ 1 + time | subject,
+                data = repeatdata, na.action = na.omit)
> summary(fitRIRT)
Linear mixed-effects model fit by REML
 Data: repeatdata 
       AIC      BIC    logLik
  495.097 507.2491 -241.5485

Random effects:
 Formula: ~1 + time | subject
 Structure: General positive-definite, Log-Cholesky parametrization
            StdDev       Corr  
(Intercept) 1.917511e+01 (Intr)
time        2.032276e-04 0     
Residual    1.044601e+01 

Fixed effects: echogen ~ time 
                Value Std.Error DF   t-value p-value
(Intercept) 64.54864   4.258235 32 15.158543  0.
time         0.35795   0.227080 32  1.576307  0.1248
 Correlation: 
 (Intr)
time -0.242

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-1.61362755 -0.52710871  0.02948008  0.41793322  1.77340082 

Number of Observations: 58
Number of Groups: 25 

> ranef(fitRIRT)
   (Intercept)          time
1    -3.278112  2.221016e-09
2   -35.400618  4.314995e-08
3    11.493110 -6.797543e-09
4   -16.209586 -7.070834e-08
5     3.585227 -2.389705e-08
6     1.614320 -1.967700e-09
7     8.346905  5.827094e-08
8    30.917812 -3.768584e-08
9    -0.394101 -9.158251e-09
10    4.437509 -4.057971e-08
11   31.956597 -2.126275e-08
12   41.567402 -4.853942e-08
13  -10.723993  1.307152e-08
14   -4.554837  5.551908e-09
15   -4.554501  4.815086e-08
16   13.296985 -3.743967e-08
17   -8.255439  1.733238e-08
18  -21.317239  2.203885e-08
19  -13.480159  2.194016e-08
20  -13.044766  2.269168e-08
21   11.639198 -1.418706e-08
22  -27.457388 -1.154099e-08
23    2.194001 -5.509119e-09
24   -3.992646  7.682188e-08
25    1.614320 -1.967700e-09
 
 


 
 
John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 
Kenneth Frost kfr...@wisc.edu 12/4/2012 10:44 AM 
I think this might be close to what you want. ranef(fitRIRT)

On 12/04/12, John Sorkin  wrote:
 I am running a random intercept random slope regression:
 
 fitRIRT <- lme(echogen ~ time, random = ~ 1 + time | subject,
                data = repeatdata, na.action = na.omit)
 summary(fitRIRT)
 
 I would like to get the subject-specific slopes, i.e. the slope that the 
model computes for each subject. If I have 10 subjects I should have 
10 slopes. I don't see the slopes when I look at the items listed in 
 names(summary(fitRIRT))
 nor when I look at the items listed in 
 names(fitRIRT)
 
 Thanks 
 John
 
 
 
 
 
 
 John David Sorkin M.D., Ph.D.
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)
 Confidentiality Statement:
 This email message, including any attachments, is for the sole use of 
the intended recipient(s) and may contain confidential and privileged 
information. Any unauthorized use, disclosure or distribution is 
prohibited. If you are not the intended recipient, please contact the 
sender by reply email and destroy all copies of the original message. 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

--
Kenneth Frost
Graduate Research Assistant - Dept. of Plant Pathology
University of Wisconsin - Madison
Lab

[R] quantreg installation and conflicts with R 2.15.2

2012-11-30 Thread Brian S Cade
I recently lost the partitions on my hard drive (second time in 6 months), 
so I had to have our IT folks image all my files over to a new drive.  I 
completely reinstalled R (now 2.15.2) and all my libraries on my computer 
(Dell Latitude running Windows 7).  A few of my previous workspaces 
(created with R 2.14.1) can't be restored, reporting an error similar to 
the one I get when I try to load the quantreg package, which requires 
SparseM (see below).   So not only will quantreg not load, but some of 
my workspaces can't be restored when being loaded (see below).  Not sure 
what this is about.   I asked Roger Koenker, the package maintainer, 
but he is traveling and won't have a chance to seriously investigate this 
for a while.  So I thought I would put it out to the R community and see if 
anyone has any suggestions about why this conflict might be occurring.

> library(quantreg)
Loading required package: SparseM
Error : object 'kronecker' is not exported by 'namespace:methods'
Error: package 'SparseM' could not be loaded

or 

Error: object 'kronecker' is not exported by 'namespace:methods'
During startup - Warning message:
unable to restore saved data in C:\CADESTUFF\DATA\BarryNoon\.RData 


Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: quantreg installation and conflicts with R 2.15.2

2012-11-30 Thread Brian S Cade
Just noticed that I get a similar error about object 'kronecker' in the 
Matrix package when trying to load lme4.   So this is a more pervasive 
problem. 

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326
- Forwarded by Brian S Cade/BRD/USGS/DOI on 11/30/2012 02:07 PM -

From:
Brian S Cade/BRD/USGS/DOI
To:
R-help@r-project.org
Date:
11/30/2012 01:16 PM
Subject:
quantreg installation and conflicts with R 2.15.2


I recently lost the partitions on my hard drive (second time in 6 months), 
so I had to have our IT folks image all my files over to a new drive.  I 
completely reinstalled R (now 2.15.2) and all my libraries on my computer 
(Dell Latitude running Windows 7).  A few of my previous workspaces 
(created with R 2.14.1) can't be restored, reporting an error similar to 
the one I get when I try to load the quantreg package, which requires 
SparseM (see below).   So not only will quantreg not load, but some of 
my workspaces can't be restored when being loaded (see below).  Not sure 
what this is about.   I asked Roger Koenker, the package maintainer, 
but he is traveling and won't have a chance to seriously investigate this 
for a while.  So I thought I would put it out to the R community and see if 
anyone has any suggestions about why this conflict might be occurring.

> library(quantreg)
Loading required package: SparseM
Error : object 'kronecker' is not exported by 'namespace:methods'
Error: package 'SparseM' could not be loaded

or 

Error: object 'kronecker' is not exported by 'namespace:methods'
During startup - Warning message:
unable to restore saved data in C:\CADESTUFF\DATA\BarryNoon\.RData 


Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two ecdf curves

2012-10-05 Thread Brian S Cade
Another alternative is to put the data in a linear model structure (one 
column for the response, another column for an indicator variable 
identifying group) and estimate all possible quantile regressions with rq() 
in the quantreg package, using a model with y ~ intercept + indicator (0,1) 
variable for group.   The estimated quantiles for the intercept will be 
the quantiles of the ecdf for one group, and the estimated quantiles for 
the indicator grouping variable will be the differences in quantiles 
(ecdf) between the two groups.   There is useful built-in graphing 
capability in quantreg with the plot.rqs() function.
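
A minimal sketch of that setup, borrowing the a and b vectors from the 
code further down the thread (the decile grid is just an example; any set 
of taus works):

library(quantreg)
y <- c(log10(a + 1), log10(b + 1))              # pooled response
group <- rep(c(0, 1), c(length(a), length(b)))  # 0/1 group indicator
fit <- rq(y ~ group, tau = 1:9/10)              # deciles of each ecdf
coef(fit)          # intercept = group-0 quantiles; group = difference in quantiles
plot(summary(fit)) # coefficient-vs-tau plot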

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
user1234 mehenderso...@gmail.com
To:
r-help@r-project.org
Date:
10/05/2012 06:46 AM
Subject:
Re: [R] Kolmogorov-Smirnov test and the plot of max distance between two 
ecdf curves
Sent by:
r-help-boun...@r-project.org



Rui, 

Your response nearly answered a similar question of mine except that I 
also
have ecdfs of different lengths. 

Do you know how I can adjust  x <- seq(min(loga, logb), max(loga, logb),
length.out=length(loga)) 
to account for this?  It must be in the length.out argument, but I'm unsure
how to proceed.

Any advice is much appreciated.

-L


Rui Barradas wrote
 Hello,
 
 Try the following.
 (I've changed the color of the first ecdf.)
 
 
 loga <- log10(a+1) # do this
 logb <- log10(b+1) # only once
 
 f.a <- ecdf(loga)
 f.b <- ecdf(logb)
 # (2) max distance D
 
 x <- seq(min(loga, logb), max(loga, logb), length.out=length(loga))
 x0 <- x[which( abs(f.a(x) - f.b(x)) == max(abs(f.a(x) - f.b(x))) )]
 y0 <- f.a(x0)
 y1 <- f.b(x0)
 
 plot(f.a, verticals=TRUE, do.points=FALSE, col="blue")
 plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)
 ## alternative, use standard R plot of ecdf
 #plot(f.a, col="blue")
 #lines(f.b, col="green")
 
 points(c(x0, x0), c(y0, y1), pch=16, col="red")
 segments(x0, y0, x0, y1, col="red", lty="dotted")
 ## alternative, down to x axis
 #segments(x0, 0, x0, y1, col="red", lty="dotted")
 
 
 Hope this helps,
 
 Rui Barradas
 maxbre wrote
 Hi all, 
 
 given this example 
 
 #start 
 
 a <- c(0,70,50,100,70,650,1300,6900,1780,4930,1120,700,190,940,
        760,100,300,36270,5610,249680,1760,4040,164890,17230,75140,1870,22380,5890,2430)
 length(a)
 
 b <- c(0,0,10,30,50,440,1000,140,70,90,60,60,20,90,180,30,90,
        3220,490,20790,290,740,5350,940,3910,0,640,850,260)
 length(b)
 
 out <- ks.test(log10(a+1), log10(b+1))
 
 # max distance D
 out$statistic
 
 f.a <- ecdf(log10(a+1))
 f.b <- ecdf(log10(b+1))
 
 plot(f.a, verticals=TRUE, do.points=FALSE, col="red")
 plot(f.b, verticals=TRUE, do.points=FALSE, col="green", add=TRUE)
 
 # inverse of ecdf a
 x.a <- get("x", environment(f.a))
 y.a <- get("y", environment(f.a))
 
 # inverse of ecdf b
 x.b <- get("x", environment(f.b))
 y.b <- get("y", environment(f.b))
 
 
 #end
 
 I want to plot the max distance between the two ecdf curves as in the
 above given chart
 
 Is that possible and how? 
 
 
 Thanks for your help
 
 PS: this is an amended version of a previous thread (but no reply
 followed) that I've deleted from the Nabble repository because I realised 
it was not clear enough (now I hope it's a little better; sorry for that)





--
View this message in context: 
http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-test-and-the-plot-of-max-distance-between-two-ecdf-curves-tp4631437p4645140.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Imputing data below detection limit

2012-08-13 Thread Brian S Cade
The below detection limit issue is similar to survival analysis with 
censoring (but left rather than right censoring).   So many survival 
estimation approaches are thus appropriate for analyses with below 
detection limits (see NADA package, also censored quantile regression in 
quantreg package, etc).
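
For example, a minimal sketch with the NADA package and made-up data (the 
concentrations and the detection limit of 0.5 are hypothetical):

library(NADA)
conc <- c(0.5, 0.5, 1.2, 3.4, 0.8, 2.1, 0.5, 5.6)  # hypothetical concentrations
cens <- conc == 0.5                                # TRUE = below detection limit
cenfit(conc, cens)   # Kaplan-Meier-type estimate for left-censored data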

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Bert Gunter gunter.ber...@gene.com
To:
Jessica Streicher j.streic...@micromata.de
Cc:
r-help@r-project.org
Date:
08/13/2012 09:28 AM
Subject:
Re: [R] Imputing data below detection limit
Sent by:
r-help-boun...@r-project.org



Yes, Jessica, the practice -- of which I also have been and continue
to be guilty -- does not really make a lot of sense. It usually
doesn't affect estimation all that much, but it can certainly mess up
inference. The proper approach is to use the proper approach: model it
as left-censored data. The problem with that is:

1. It's complicated, and is beyond the statistical background of most
folks who deal with such data -- it's a ubiquitous issue in science
and engineering after all.

2. Typically, the LOD isn't: that is, there often is not a well-defined
value, and that which is chosen is both arbitrary and inaccurate. What one
often sees is an increasing loss of relative precision as one approaches
the designated value. Modeling this effectively gets even more complicated.
David Rocke and colleagues have published methodology on this, mostly in
TECHNOMETRICS if memory serves.

3. So, as in other situations, we muddle along with rather crude
statistical approaches and hope that they are adequate. Probably in
most circumstances they are, but ...

Cheers,
Bert

On Mon, Aug 13, 2012 at 1:15 AM, Jessica Streicher
j.streic...@micromata.de wrote:
 Tempting a use of "let me google that for you"...

 Anyway, there's a package called Imputation. I myself used the zoo 
package. There are probably lots of others since it's a real common 
problem.

 They usually fill in places in your data that are designated as NA.

 I do not completely understand what you mean by detection limit. If 
you do not have NAs, but rather some kind of threshold, I'd suggest going 
over the data and filling any applicable values with NAs, then use the 
library of your choice. I find that kind of weird though; if you haven't 
detected much, you haven't detected much. It's part of the data, why impute?

 On 11.08.2012, at 23:01, aynumazi wrote:

 Hello,

 I'm trying to impute data below detection limit (with multiple detection
 limits),
 so I need just a method or code for imputation and then extract the
 complete dataset to do the analyses.
 Is there any package which could do that simply, as I'm a beginner in R?

 Thank you



 --
 View this message in context: 
http://r.789695.n4.nabble.com/Imputing-data-below-detection-limit-tp4640057.html

 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] quantreg Wald-Test

2012-07-30 Thread Brian S Cade
Stefan:  Your comparison of two models, one with a nested reduced set of 
parameters obtained by fixing the others at zero, with anova() can be used 
to make inferences based either on a likelihood ratio form of test or on 
the rankscore test for a given quantile (see ?anova.rq and the package 
vignette for literature citations with details on the test statistics). 
So this comparison corresponds to the typical linear model hypothesis of a 
set of parameters equaling zero.  I prefer the rankscore test if sample 
sizes are not too large (e.g., < 5,000). 
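
A hedged sketch of the comparison with the engel data shipped in quantreg 
(Stefan's Granger setup would substitute his lagged Y and X terms; the 
nested models must share the same tau and data):

library(quantreg)
data(engel)
fit.full <- rq(foodexp ~ income, tau = 0.5, data = engel)
fit.reduced <- rq(foodexp ~ 1, tau = 0.5, data = engel)
anova(fit.reduced, fit.full)                 # default Wald-type test
anova(fit.reduced, fit.full, test = "rank")  # rankscore test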

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
stefan23 stefan.vo...@uni-konstanz.de
To:
r-help@r-project.org
Date:
07/28/2012 10:32 PM
Subject:
[R] quantreg Wald-Test
Sent by:
r-help-boun...@r-project.org



Dear all,
I know that my question is somewhat special but I tried several times to
solve the problems on my own but I am unfortunately not able to compute 
the
following test statistic using the quantreg package. Well, here we go, I
appreciate every little comment or help as I really do not know how to 
tell
R what I want it to do^^
My situation is as follows: I have a data set containing a (dependent)
vector Y and the regressor X. My aim is to check whether the two variables
do not granger-cause each other in quantiles. I started to compute via
quantreg for a single tau:= q:
rq(Y_t~Y_(t-1)+Y_(t-2)+...+X_(t-1)+X_(t-2)+...,tau=q) 
This gives me the quantile regression coefficients. Now I want to check
whether all the coefficients of X are equal to zero (for this specific 
tau).
Can I do this by applying anova.rq? I have already asked a similar
question but I am not sure if anova is really calculating this for me.
Currently I am calculating
fitunrestricted <- rq(Y_t ~ Y_(t-1)+Y_(t-2)+...+X_(t-1)+X_(t-2)+..., tau=q) 
fitrestricted <- rq(Y_t ~ Y_(t-1)+Y_(t-2)+..., tau=q)
anova(fitrestricted, fitunrestricted)
If this is correct can you tell me how the test value is calculated in 
this
case, or in other words:
My next step is going to be to replicate this procedure for a whole range of
quantiles (say for tau in [a,b]). To apply a sup-Wald test I am wondering if
it is correct to choose the maximum of the different test values and to
simulate the critical values by using the data tabulated in Andrews (1993)
(or simulate the vectors of independent Brownian motions)... please feel free
to comment, I am really looking forward to your help!
Thank you very much 
Cheers
Stefan




--
View this message in context: 
http://r.789695.n4.nabble.com/quantreg-Wald-Test-tp4638198.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Higher log-likelihood in null vs. fitted model

2012-06-01 Thread Brian S Cade
Had several instances of similar issues with weighted least squares and 
weighted logistic regression recently.  It is important to remember that 
the log-likelihoods for the weighted models include a term that is a 
function of the n weights, and that these terms are not incorporated in 
deviances (or sums of squares for LS regression).
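
A minimal simulated sketch of the point (the non-integer weights mimic 
survey weights; glm() will warn about non-integer successes, which is 
harmless for this illustration):

set.seed(1)
dat <- data.frame(y = rbinom(200, 1, 0.3),
                  w = runif(200, 0.5, 2))   # non-integer, survey-type weights
m0 <- suppressWarnings(glm(y ~ 1, family = binomial, data = dat, weights = w))
logLik(m0)        # derived from the AIC component; weights enter via rounded counts
-m0$deviance/2    # derived from the deviance residuals; weights enter exactly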

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Andrew Miles rstuff.mi...@gmail.com
To:
Mark Leeds marklee...@gmail.com
Cc:
r-help@r-project.org
Date:
05/31/2012 04:09 PM
Subject:
Re: [R] Higher log-likelihood in null vs. fitted model
Sent by:
r-help-boun...@r-project.org



Interesting.  If you use the deviances printed out by the fitted model
(including the null deviance) and back-engineer it to log-likelihoods, you
get results identical to Stata.

 m$deviance*(-.5)

[1] -390.9304

 m$null.deviance*(-.5)

[1] -393.064
However, using the deviance to calculate the AIC by hand does not get you
the AIC that the model returns.

 m$deviance + 2*2

[1] 785.8608

 m$aic

[1] 764.3816
The difference seems to be in how the function glm.fit() calculates the 
AIC
that it returns, and which the function logLik() uses to return the
log-likelihood, as Mark indicated.  The function is:

binomial()$aic

function (y, n, mu, wt, dev)
{
    m <- if (any(n > 1))
        n
    else wt
    -2 * sum(ifelse(m > 0, (wt/m), 0) * dbinom(round(m * y),
        round(m), mu, log = TRUE))
}
On the other hand, glm() calculates the deviance based on the sum of the
deviance residuals.

This gives me a sense for how log-likelihoods calculated using the deviance
and the model AIC could differ (i.e., different ways of calculating them),
but it is still not clear to me *why* the two approaches differ.  And they
only differ when I use weights with the data.

More importantly, it makes me wonder which is more reliable - in my case,
it seems the deviance residuals approach is more reliable in that a) it
gives a sensible result (i.e., a model with a predictor has a higher
log-likelihood than the null model), and b) it matches an independent
estimation performed in Stata.  But is that always the case? And if so, 
why
use the other approach at all?

On Thu, May 31, 2012 at 9:55 AM, Mark Leeds marklee...@gmail.com wrote:

 Hi Duncan: I don't know if the following can help, but I checked the code
 and logLik defines the log likelihood as (p - glmobject$aic/2), where p is
 the glmobject$rank.  So,
 the reason for the likelihood being less is that, in the null, it ends up
 being (1 - glmobject$aic/2) and in the other one it ends up being (2 -
 glmobject$aic/2).

 so

  2 - 764.4/2 = -380.2 and

  1 - 761.9/2 = -379.95 ( close enough for govt work )

 So, that's where the #'s are coming from but it really depends on how 
AIC
 is defined.
 Likelihoods should not involve degrees of freedom (at least not where they
 make
 likelihood less, as in the above example), so maybe backing the
 likelihood out using
 AIC is the issue?  (AIC = -2 * likelihood + 2p, so p - AIC/2 =
 likelihood). AIC is a function of the likelihood but, as far as I know,
 likelihood is not a function of the AIC.
 Thanks for any insight.






 On Thu, May 31, 2012 at 9:26 AM, Duncan Murdoch 
murdoch.dun...@gmail.comwrote:

 On 12-05-31 8:53 AM, Andrew Miles wrote:

 Two related questions.

 First, I am fitting a model with a single predictor, and then a null
 model
 with only the intercept.  In theory, the fitted model should have a
 higher
 log-likelihood than the null model, but that does not happen.  See the
 output below.  My first question is, how can this happen?


 I suspect you'll need to give sample data before anyone can really help
 with this.


  m


 Call:  glm(formula = school ~ sv_conform, family = binomial, data = 
dat,
 weights = weight)

 Coefficients:
 (Intercept)   sv_conform
 -2.5430   0.2122

 Degrees of Freedom: 1488 Total (i.e. Null);  1487 Residual
 Null Deviance:786.1
 Residual Deviance: 781.9 AIC: 764.4

 null


 Call:  glm(formula = school ~ 1, family = binomial, data = dat, 
weights =
 weight)

 Coefficients:
 (Intercept)
  -2.532

 Degrees of Freedom: 1488 Total (i.e. Null);  1488 Residual
 Null Deviance:786.1
 Residual Deviance: 786.1 AIC: 761.9

 logLik(m); logLik(null)

 'log Lik.' -380.1908 (df=2)
 'log Lik.' -379.9327 (df=1)



 My second question grows out of the first.  I ran the same two models on
 the
 same data in Stata and got identical coefficients.  However, the
 log-likelihoods were different from the ones I got in R, and followed my
 expectations - that is, the null model has a lower log-likelihood than
 the
 fitted model.  See the Stata model comparison below.  So my question is,
 why do identical models fit in R and Stata have different
 log-likelihoods?


 That's easier:  they use different base measures.  The likelihood is 
only

Re: [R] Function to compute multi-response, multi-rater kappa?

2012-02-02 Thread Brian S Cade
Luk:  Don't know if this solves your desire for an implementation in R, 
but the most general extension of Cohen's kappa for testing agreement that 
I'm aware of are the extensions made by using multi-response randomized 
block permutation procedures (MRBP) developed by Paul Mielke and Ken 
Berry.  They calculate a generalized measure of agreement that can be 
applied to nominal, ordinal, or continuous data.  I know it is used for 
computing and testing Cohen's kappa for multiple raters with nominal data. 
I'm unsure whether it is easily applied to multiple responses at the same 
time as multiple raters, but it might be.  Might check out Mielke and 
Berry (2007, Permutation Methods, 2nd ed., pp. 150-166). We distribute a 
package of permutation software (Blossom) that computes all the 
multiresponse permutation procedure family of statistics, including MRBP 
(we're in the process of porting it to an R package, but it will be several 
months before it's ready to go).  There are versions of MRPP (less 
complete) available in the vegan package for R, but I don't know whether 
it will do the randomized block variant required for Cohen's kappa. 

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Luk Arbuckle luk.arbuc...@gmail.com
To:
David Winsemius dwinsem...@comcast.net
Cc:
r-help@r-project.org
Date:
02/01/2012 08:37 PM
Subject:
Re: [R] Function to compute multi-response, multi-rater kappa?
Sent by:
r-help-boun...@r-project.org



Although interesting, Dave, this doesn't fit my problem.  I want to 
measure
the percentage agreement corrected for chance using an extension of kappa.
 The example data presented in the paper you linked to is considering an
ordinal measure (ranked preference), whereas I'm looking to measure
correlation between a nominal measure (agreement between non-ordered
categories).

The paper by Kraemer is cited over 100 times in Google Scholar, mostly in
the health sciences, so I'm surprised it's not implemented in R.  But I
suppose this is a niche problem (multi-response version of kappa), or that
there is some other extension to kappa, maybe in the social sciences, that
I'm not aware of.

Cheers,

Luk Arbuckle

On Wed, Feb 1, 2012 at 17:13, David Winsemius  wrote:

 Searching on multiple raters attributes at the same site brings up

 http://finzi.psych.upenn.edu/R/library/smacof/doc/smacof.pdf  (by Jan De
 Leeuw)

   Which has as one example multiple raters scoring different 
breads.



 On Feb 1, 2012, at 4:41 PM, Luk Arbuckle wrote:

 Thanks David, but those are not multi-response versions of the kappa.
  Extensions to multiple raters are common and well known.  I am hoping
 someone familiar with multiple response extensions of kappa might see my
 post and be able to help.

 As I said, my search on cran has failed.  I tried all the expected
 keywords, and looked through several kappa functions, but I don't see 
any
 that deal with the multi-response case as I've described it.  Either it
 isn't available in R, or I'm looking in the wrong place.

 I did not intentionally double post, nor try to deceive your efforts to
 block double posting.  I am not receiving my posts, contrary to my
 settings, so I rewrote the first one.  I thought maybe it was blocked 
the
 first time because someone thought it wasn't an R question, so I changed
 the subject.

 Cheers,

 Luk Arbuckle

 On Wed, Feb 1, 2012 at 16:25, David Winsemius wrote:


 On Feb 1, 2012, at 3:13 PM, Luk Arbuckle wrote:

 I'm very sorry for double posting!  My r-help setting "Receive your own
 posts to the list?" is set to Yes, and "Mail delivery" is Enabled.  Yet I
 did not get a copy of my post (this message is a reply from my sent mail).
 I only learned of the double posting when I found it copied in an r-help
 archive.  Again, my apologies.


 I actually had a chance to prevent that second posting. It looked
 familiar when viewed in the moderation queue and I took a quick look at
 what was in my inbox but since you used a different subject line my 
search
 failed.

 Speaking of searching ... you are asked in the Posting Guide (that no 
one
 reads) to post the specifics of your own efforts. My first search with
 multi-rater kappa failed. My second search is here:

 http://search.r-project.org/cgi-bin/namazu.cgi?query=multiple+raters+kappa&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp08&idxname=Rhelp10&idxname=Rhelp02


 Having gotten more than one apparently on-target result with relatively
 minor effort, I see no point in my expending even more time.

 --
 David.





 Luk Arbuckle

 On Wed, Feb 1, 2012 at 13:47, Luk Arbuckle wrote:

  I'm looking for a function in R that extends kappa

Re: [R] Quantreg-rq crashing trouble

2011-07-21 Thread Brian S Cade
Steve:  I would guess that the problem relates to the large number of tied 
values of -1 in your dependent y variable.  You could randomly jitter 
these y = -1 values by adding a random uniform number between, say, 
[-0.01, 0.01] and see if rq() converges to a solution.   Then you would 
know that was the numeric computing issue.   Then the question would be 
what to do next.  Seems like a funny data problem with a point mass of 
responses at -1.  Perhaps only higher quantiles, say 0.80, are going to 
give usable estimates.  Perhaps the jittered responses can yield a 
reasonably interpreted estimate.
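
A minimal sketch of that check, reusing Steve's x and y from below (the 
jitter width is the suggested +/- 0.01):

library(quantreg)
set.seed(42)
y.jit <- y
tied <- y.jit == -1                                  # the point mass at -1
y.jit[tied] <- y.jit[tied] + runif(sum(tied), -0.01, 0.01)
fit1.jit <- rq(y.jit ~ x, tau = 0.5)                 # does it converge now?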

Brian 

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Steven R Corsi srco...@usgs.gov
To:
r-help@r-project.org
Date:
07/21/2011 04:04 PM
Subject:
[R] Quantreg-rq crashing trouble
Sent by:
r-help-boun...@r-project.org



Hi

I am using the quantreg package for median regression for a large series 
of subsets of data. It works fabulously for all but one subset. When it 
reaches this subset, R takes the command and never responds. I end up 
having to kill R and restart it.

It appears to be something with the particular data subset, but I can't 
pinpoint the problem.

Here are some details
Operating system: Windows 7
R version: 2.12.1

Here is the data and the rq command that gives me trouble:

library(quantreg)

x <- c(-0.340778085786686,-0.573639751645382,-0.663932762810308,-0.438591328531796,0.302202380883637,-0.675558868120683,-0.764547425063882,-0.751796238115147,-0.481835451050657,-0.588287304540034,-0.622315341312595,-0.542777491991884,-0.552343921339062,-0.587743299883,-0.758233854317935,-0.783134744819092,-0.97774093234124,0.859832969267456,0.69037126308323,0.185409334523753,-0.432951955490942,-0.988120972598647,0.243223425575187)

y <- c(2.35531739878456,-1,2.26484142532915,-1,2.86895579641295,2.6655997506336,-1,1.33021078457153,-1,-1,-1,1.8263340056,1.60831204269733,-1,2.45313479655685,-1,-1,-1,-1,-1,-1,-1,-1)

fit1 <- rq(y ~ x, tau = .5)


Any feedback would be greatly appreciated.
Thanks
Steve

-- 
===
Steven R. CorsiPhone: (608) 821-3835
Research Hydrologist   email: srco...@usgs.gov
U.S. Geological Survey
Wisconsin Water Science Center
8505 Research Way
Middleton, WI 53562

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] quantile regression: out of memory error

2011-07-11 Thread Brian S Cade
Using tau = -1 is causing rq() to try and estimate all possible quantiles 
and store the results.  With 11253 observations this would be a formidable 
feat.   Try estimating the model with say tau = 1:99/100 to give a more 
tractable number of estimates.
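
A minimal sketch, reusing Paul's variable names from the post below:

library(quantreg)
fit <- rq(Satisfaction ~ CleanScore.Cycle1, tau = 1:99/100,
          data = subsetAudit.dat)  # 99 fixed quantiles instead of the full process
plot(fit)                          # coefficients as a function of tau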

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Prew, Paul paul.p...@ecolab.com
To:
r-help@R-project.org r-help@r-project.org
Date:
07/11/2011 11:42 AM
Subject:
[R] quantile regression: out of memory error
Sent by:
r-help-boun...@r-project.org



Hello,  I'm wondering if anyone can offer advice on the out-of-memory 
error I'm getting. I'm using R 2.12.2 on Windows XP, Platform: 
i386-pc-mingw32/i386 (32-bit).

I am using the quantreg package,  trying to perform a quantile regression 
on a dataframe that has 11,254 rows and 5 columns.

 object.size(subsetAudit.dat)
450832 bytes

 str(subsetAudit.dat)
'data.frame':   11253 obs. of  5 variables:
 $ Satisfaction : num  0.64 0.87 0.78 0.75 0.83 0.75 0.74 0.8 0.89 
0.78 ...
 $ Return   : num  0.84 0.92 0.91 0.89 0.95 0.81 0.9 0.87 0.95 
0.88 ...
 $ Recommend: num  0.53 0.64 0.58 0.58 0.62 0.6 0.56 0.7 0.64 0.65 
...
 $ Cust.Clean   : num  0.75 0.85 0.72 0.72 0.81 0.79 0.79 0.8 0.78 
0.75 ...
 $ CleanScore.Cycle1: num  96.7 83.3 93.3 86.7 96.7 96.7 90 80 81.7 86.7 
...

rq(subsetAudit.dat$Satisfaction ~ subsetAudit.dat$CleanScore.Cycle1, tau = 
-1)

ERROR:  cannot allocate vector of size 2.8 Gb

I don't know much about computers (software, hardware, algorithms), but 
does this mean that the quantreg package is creating some massive 
intermediate vector as it performs the rq function?   Quantile regression 
is something I'm just starting to explore, but I believe it involves 
ordering data prior to the regression, which could be a huge job when 
using 11,000 records.   Does bigmemory have functionality to help me with 
this?

Thank you,
Paul






Paul Prew  ·  Statistician
651-795-5942  ·  fax 651-204-7504
Ecolab Research Center  ·  Mail Stop ESC-F4412-A
655 Lone Oak Drive  ·  Eagan, MN 55121-1560




CONFIDENTIALITY NOTICE: \ This e-mail communication an...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] fisher exact for 2x2 table

2011-04-29 Thread Brian S Cade
Rob:   Fisher's exact test is conceptually possible for any r x c 
contingency table problem and uses the observed multinomial table 
probability as the test statistic.   Other tests for r x c contingency 
tables use a different test statistic (Chi-squared, likelihood ratio, 
Zelterman's).  It is possible that the probabilities for any of these 
procedures may differ slightly for the same table configuration even if 
the probabilities for each test are calculated by enumerating all possible 
permutations (hypergeometric) under the null hypothesis.   See Mielke and 
Berry (2007, Permutation Methods: A Distance Function Approach, Chps. 6 
and 7).   Mielke has provided efficient Fortran algorithms for enumerating 
the exact probabilities for 2x2, 3x2, 4x2, 5x2, 6x2, 3x3, and even 2x2x2 
tables for Fisher's exact and Chi-square statistics.   I don't remember 
whether Cyrus Mehta's algorithms for Fisher's exact can do more.  But the 
important point to keep in mind is that it is possible to use different 
statistics for evaluating the same null hypothesis for r x c tables 
(Fisher's exact uses one form, Chi-square uses another, etc.), and the 
probabilities can be computed by exact enumeration of all permutations 
(what people expect Fisher's exact to do, but also possible for the 
Chi-square statistic) or by some approximation (asymptotic distribution, 
Monte Carlo resampling).  The complete enumeration of test statistics 
under the null becomes computationally intractable for large-dimension 
r x c problems, whether using the observed table probability (like 
Fisher's exact) as a test statistic or another like the Chi-square statistic.

So in short, yes, you can use Fisher's exact on your 4 x 2 problem, and the 
result might differ from using a Chi-square statistic even if you compute 
the P-value for the Chi-square test by complete enumeration.   Note that 
the minimum expected cell size for the Chi-square test relates to whether 
the Chi-square distributional approximation (an asymptotic argument) for 
evaluating the Chi-square statistic will be reasonable; it is irrelevant 
if you calculate your probabilities by exact enumeration of all permutations.
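
In R itself, fisher.test() accepts r x c tables directly (for tables 
larger than 2 x 2 it uses the Mehta-Patel FEXACT network algorithm cited 
later in this thread); a minimal sketch with hypothetical 4 x 2 counts:

tab <- matrix(c(8, 2,
                5, 5,
                3, 7,
                1, 9), nrow = 4, byrow = TRUE)  # made-up 4 x 2 counts
fisher.test(tab)           # exact P-value for the 4 x 2 table
chisq.test(tab)$p.value    # Chi-square approximation, for comparison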

Brian
 

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
viostorm rob.sch...@gmail.com
To:
r-help@r-project.org
Date:
04/29/2011 01:23 PM
Subject:
Re: [R] fisher exact for  2x2 table
Sent by:
r-help-boun...@r-project.org




After I shared comments from the forum yesterday with the biostatistician,
he indicated this:

"Fisher's exact test is the non-parametric analog of the Chi-square 
test for 2x2 comparisons. A version (or extension) of the Fisher's exact 
test, known as the Freeman-Halton test, applies to comparisons for tables 
greater than 2x2. SAS can calculate both statistics using the following 
instructions.

  proc freq; tables a * b / fisher;"

Do people here still stand by the position that Fisher's exact test can be
used for RxC contingency tables?  Sorry to bother you all so much; it is
just important for a paper I am writing and planning to submit soon. (I
have a 4x2 table but it does not meet the expected-frequency requirements
for chi-squared.)

I guess people here have suggested that R implements the following, which
unfortunately are not easily available at my library, but at least
the titles indicate the test is extended to RxC: 
 
Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test
in r x c contingency tables. Journal of the American Statistical Association
1983;78:427-34.
 
Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for 
Fisher's
exact test on unordered r x c contingency tables. ACM Transactions on
Mathematical Software 1986;12:154-61.

The only reason I ask again is he is exceptionally clear on this point.

Thanks again, 

-Rob



viostorm wrote:
 
 Thank you all very kindly for your help.
 
 -Rob
 
 
 Robert Schutt III, MD, MCS 
 Resident - Department of Internal Medicine
 University of Virginia, Charlottesville, Virginia
 



--
View this message in context: 
http://r.789695.n4.nabble.com/fisher-exact-for-2x2-table-tp3481979p3484009.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo

Re: [R] Comparing non nested models with correlation coefficients (inspired from Lorch and Myers )

2011-03-23 Thread Brian S Cade
As a follow-up to Greg's suggested graphical presentation, it seems the 
Vuong test is sometimes used to compare the fits of non-nested models.
Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Greg Snow greg.s...@imail.org
To:
Boris New new.bo...@gmail.com, r-help@r-project.org 
r-help@r-project.org
Date:
03/23/2011 01:14 PM
Subject:
Re: [R] Comparing non nested models with correlation coefficients 
(inspired from Lorch and Myers )
Sent by:
r-help-boun...@r-project.org



If you are interested in the fits, then I would just plot the fits.  Plot 
the fitted/predicted values from the 1st model as the x-values and the 
fitted/predicted values from the second model as the y-values.  It is best 
to plot on a square plotting region and use asp=1, probably also doing 
abline(0,1) after.  This gives a nice visual of how the predictions from 
the models compare.  You could take it a step further by taking the 2 sets 
of predictions and doing a Tukey mean-difference plot, or a Bland-Altman 
plot.
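
A minimal sketch of both displays (fit1 and fit2 are hypothetical 
placeholders for the two fitted model objects):

p1 <- predict(fit1)   # predictions from model 1
p2 <- predict(fit2)   # predictions from model 2
plot(p1, p2, asp = 1, xlab = "model 1 fitted", ylab = "model 2 fitted")
abline(0, 1)          # line of perfect agreement
plot((p1 + p2)/2, p1 - p2, xlab = "mean of fits", ylab = "difference of fits")
abline(h = 0)         # Tukey mean-difference version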

I am not sure that the test you suggest would be meaningful.  What null 
hypothesis is it testing and what does it mean if it is not rejected? 
(what is the power to reject?).  Do you really need a formal test?  If not 
then the plot is probably more meaningful.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Boris New
 Sent: Wednesday, March 23, 2011 7:32 AM
 To: r-help@r-project.org
 Subject: [R] Comparing non nested models with correlation coefficients
 (inspired from Lorch and Myers )
 
 Hi,
 
 I would like to compare two models in R with the same dependant
 variable but
 different predictors (two different types of frequency and reaction
 times as
 RT).
 An editor told me to have a look at Lorch and Myers 1990.
 
 Lorch and Myers use the following technique:
 1) they perform regressions on individual subjects' data
 2) they extract the beta weights
 3) they run t-test on these beta weights.
 
 The point is that I don't want to compare the effect sizes from the
 different models but how well the two models fit. So my idea was to
 extract
 the correlation coefficients instead of the betas and do t-tests on
 these.
 I checked and my correlation coefficients are normally distributed...
 
 Is it ok to do that?
 
 Best regards,
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] standard error for predicted mean count from ZIP

2010-07-30 Thread Brian S Cade
Does anyone have code for computing the standard error of the predicted 
mean count from a zero-inflated Poisson regression model estimated by the 
zeroinfl() function from the pscl package (and yes, we've checked with A. 
Z. already)?

Thank you

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] weighted.median function from package R.basic

2010-03-30 Thread Brian S Cade
While perhaps not the solution you were looking for, you might consider 
estimating weighted medians with linear quantile regression (just specify 
an intercept for a single-sample analysis, tau=0.50, and weights = your 
weights) in the quantreg package. Quantile regression does not require 
sorting to estimate medians (it minimizes an objective function) and thus 
might require less computing time on a large data set. 
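
A minimal sketch with simulated data (the vectors are placeholders for 
your response and weights):

library(quantreg)
set.seed(1)
y <- rexp(1e5)   # a large response vector
w <- runif(1e5)  # arbitrary positive weights
coef(rq(y ~ 1, tau = 0.5, weights = w))  # the weighted median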

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Joris Meys jorism...@gmail.com
To:
R mailing list r-help@r-project.org
Date:
03/30/2010 10:39 AM
Subject:
[R] weighted.median function from package R.basic
Sent by:
r-help-boun...@r-project.org



Dear all,

I want to apply a weighted median on a huge dataset, and I remember a
function from the package R.basic that could do this using an internal
sorting algorithm qsort. This speeded things up quite a bit. Alas, I can't
find that package anywhere anymore. There is a weighted.median function in
the package limma too, but I didn't use that before.

Anybody who knows what happened to  R.basic?

Cheers
Joris

-- 
Joris Meys
Statistical Consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

Coupure Links 653
B-9000 Gent

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] opposite estimates from zeroinfl() and hurdle()

2009-10-23 Thread Brian S Cade
Tord:   The logistic zero-inflation portion of the zeroinfl() 
implementation of ZIP or ZINB predicts the probability of 0 rather than 
the probability of 1 (> 0 counts), so the signs of the coefficients are 
often reversed from how you would expect them to be if you had just 
performed a logistic regression.  I'm guessing that the hurdle model, as a 
two-stage model, is using a logistic regression predicting the probability 
of 1, hence the reversed signs of the estimates in the logistic regression 
portion of the model.
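
A minimal sketch of the comparison (fit.zip and fit.hurdle stand in for 
Tord's two fits below; the model argument in pscl extracts the binomial part):

library(pscl)
coef(fit.zip, model = "zero")     # predicts Pr(zero):     e.g.  21.75, -1.14
coef(fit.hurdle, model = "zero")  # predicts Pr(positive): same magnitudes, flipped signs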

Brian


Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Tord Snäll tord.sn...@ekol.slu.se
To:
r-help@r-project.org
Date:
10/23/2009 07:40 AM
Subject:
[R] opposite estimates from zeroinfl() and hurdle()
Sent by:
r-help-boun...@r-project.org



Dear all,
A question related to the following has been asked on R-help before, but 
I could not find any answer to it. Input will be much appreciated.

I got an unexpected sign of the slope parameter associated with a 
covariate (diam) using zeroinfl(). It led me to compare the estimates 
given by zeroinfl() and hurdle():

The (significant) negative estimate here is surprising, given the 
biology of the species:

> summary(zeroinfl(bnl ~ 1 | diam, dist = "poisson", data = valdaekar, 
  EM = TRUE))
Count model coefficients (poisson with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***

Zero-inflation model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)   
(Intercept)  21.7510     7.6525   2.842  0.00448 **
diam         -1.1437     0.3941  -2.902  0.00371 **

Number of iterations in BFGS optimization: 1
Log-likelihood: -582.8 on 3 Df


The hurdle model gives the same estimates, but with opposite (and 
expected) signs of the parameters:

> summary(hurdle(bnl ~ 1 | diam, dist = "poisson", data = valdaekar))
Count model coefficients (truncated poisson with log link):
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  3.74604    0.02635   142.2   <2e-16 ***
Zero hurdle model coefficients (binomial with logit link):
            Estimate Std. Error z value Pr(>|z|)   
(Intercept) -21.7510     7.6525  -2.842  0.00448 **
diam          1.1437     0.3941   2.902  0.00371 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of iterations in BFGS optimization: 8
Log-likelihood: -582.8 on 3 Df

Why is this so?

thanks,
Tord
Windows NT, R 2.8.1, pscl 1.03

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quantile GAM?

2009-06-01 Thread Brian S Cade
There are possibilities with rqss() as someone else mentioned.  But you 
can also conduct a lot of useful modeling just by using B-splines within 
the rq() function - something like
my.result <- rq(y ~ bs(x, degree=3)), where bs() is the B-spline function 
from the splines package.  You get to specify the degree of the polynomial 
and the number and location of knots.
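
A minimal simulated sketch of that approach:

library(quantreg)
library(splines)
set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.2*(1 + x/10))        # heteroscedastic noise
fit <- rq(y ~ bs(x, df = 6, degree = 3), tau = 0.9)  # cubic B-spline, 0.9 quantile
ord <- order(x)
plot(x, y)
lines(x[ord], fitted(fit)[ord], col = "red")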

Brian


Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
Jonathan Greenberg greenb...@ucdavis.edu
To:
r-help@r-project.org r-help@r-project.org
Date:
05/29/2009 05:55 PM
Subject:
[R] Quantile GAM?
Sent by:
r-help-boun...@r-project.org



R-ers:

I was wondering if anyone had suggestions on how to implement a GAM 
in a quantile fashion?  I'm trying to derive a model of a hull of 
points which are likely to require higher-order polynomial fitting (e.g. 
splines)-- would quantreg be sufficient, if the response and predictors 
are all continuous?  Thanks!

--j

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Goodness of fit in quantile regression

2009-05-22 Thread Brian S Cade
Laura:  Part of the issue may depend on what you mean by goodness-of-fit. 
If you are looking for some global measure like a pseudo R^2 or AIC to 
select among models, you ought to be able to make those calculations from 
the objective function that was minimized, as you recognized.  If 
qr.fit.sfn() does not return the objective function the way rq.fit.br(), 
the simplex routine, does, you still ought to be able to do the 
calculations by performing the asymmetric weighting of the residuals from 
the model (see Roger Koenker's 2005 book).  Now, if by goodness-of-fit you 
mean how the model fits in local regions of the predictor space, then you 
might want to check out Stef van Buuren's work on worm plots to diagnose 
fit in quantile regression.  Don't remember where he published this, but 
his email is stef.vanbuu...@tno.nl
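
A minimal sketch of that residual-weighting calculation, the 
Koenker-Machado R^1 (fit and fit.null are placeholders for the fitted and 
intercept-only models at quantile tau):

rho <- function(u, tau) sum(u*(tau - (u < 0)))  # asymmetric check-function loss
V1 <- rho(resid(fit), tau)       # minimized objective, full model
V0 <- rho(resid(fit.null), tau)  # minimized objective, intercept-only model
1 - V1/V0                        # pseudo R^1 at this tau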

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From:
laura m. mayorala...@gmail.com
To:
r-help@r-project.org
Date:
05/22/2009 03:29 AM
Subject:
[R]  Goodness of fit in quantile regression
Sent by:
r-help-boun...@r-project.org




Dear R users,


I've used the function qr.fit.sfn to estimate a quantile regression on a
panel data set. Now I would like to compute a statistic to measure the
goodness of fit of this model.

Does someone know how I could do that? I could compute a pseudo R2, but in
order to do that I would need the value of the objective function at the
optimum, and I don't see how to get this from the function qr.fit.sfn.

If someone has any good idea about how to solve this problem I would be 
most grateful!

Best

Laura
-- 
View this message in context: 
http://www.nabble.com/Goodness-of-fit-in-quantile-regression-tp23666962p23666962.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] weighted quantiles

2008-10-07 Thread Brian S Cade
Try the linear quantile regression function rq() in the quantreg package. 
For one-sample estimates, your model would have just an intercept term. 
There is a weights argument.

quantiles.out <- rq(y ~ 1, data=mydata, tau=1:9/10, weights=myweights)

would yield the 0.10, 0.20, ..., 0.80, 0.90 weighted quantile estimates.

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326



sj [EMAIL PROTECTED] 
Sent by: [EMAIL PROTECTED]
10/07/2008 12:38 PM

To
r-help [EMAIL PROTECTED]
cc

Subject
[R] weighted quantiles






I have a set of values and their corresponding weights. I can use the
function weighted.mean to calculate the weighted mean, I would like to be
able to similarly calculate the weighted median and quantiles? Is there a
function in R that can do this?

thanks,


Spencer

 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quantile regression with complex survey data

2008-08-22 Thread Brian S Cade
Just as a follow-up on T. Lumley's post, 2 citations that may be useful in 
reference to application of quantile regression with survey samples are:

Bassett and Saleh.  1994.  L_1 estimation of the median of a survey 
population.  Nonparametric Statistics 3: 277-283.  (L_1 estimation of 
median is 0.50 quantile regression).

Bassett et al.  2002.  Quantile models and estimators for data analysis. 
Metrika 55: 17-26.  (describes weighted QR for survey of school 
characteristics and student achievement scores).

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326



Stas Kolenikov [EMAIL PROTECTED] 
Sent by: [EMAIL PROTECTED]
08/20/2008 01:14 PM

To
Cheng, Yiling (CDC/CCHP/NCCDPHP) [EMAIL PROTECTED]
cc
r-help@r-project.org
Subject
Re: [R] Quantile regression with complex survey data






On Wed, Aug 20, 2008 at 8:12 AM, Cheng, Yiling (CDC/CCHP/NCCDPHP)
[EMAIL PROTECTED] wrote:
 I am working on the NHANES survey data, and want to apply quantile
 regression on these complex survey data. Does anyone know how to do
 this?

There are no references in the technical literature (thinking of Annals,
JASA, JRSS B, Survey Methodology). Absolutely none. Zero. You might be
able to apply the procedure mechanically and then adjust the standard
errors, but God only knows what the population equivalent is of whatever
that model estimates, if there is a population analogue at all.

In general, quantile regression is a heavily model-based concept: for each
value of the explanatory variables there is a well-defined distribution of
the response, and quantile regression puts additional structure on it,
namely linearity of the quantiles with respect to some explanatory
variables. That does not mesh well with the design paradigm under which
survey estimation is usually conducted. In the latter, the finite
population and the characteristics of every unit are assumed fixed, and
randomness comes only from the sampling procedure. Within that paradigm
you can define the marginal distribution of the response (or any other)
variable, but the conditional distributions may simply be unavailable
because there are no units in the population satisfying the conditions.

-- 
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] weights in lmer

2008-06-16 Thread Brian S Cade
I originally sent this to Doug Bates but have received no reply yet, so I
thought I would expand to a wider audience.

I've been trying to estimate linear mixed-effects models in lmer() from the
lme4 package using the weights option.  The help and code for lmer()
suggest to me that this is implemented, but I can't seem to get it to do
anything with weights = ; no error message is reported, it just seems to
ignore the weights option, as my weighted and unweighted models are
identical in all summary statistics.  Below is an example of my model
code.   The weights are lengths of observational bouts on wild horse
behavior.   So is weights not really implemented yet, or am I missing
something?


m2.wt <- lmer(HD_RATE ~ NUM_PAIR + (NUM_PAIR | HMA), data = out2.5,
              weights = HOURS/max(HOURS),
              contrasts = list(NUM_PAIR = contr.treatment), family = gaussian)
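
(A quick way to check whether the weights are honored at all; a sketch with
simulated toy data, assuming a current version of lme4. If the weights take
effect, the two sets of fixed-effect estimates should differ.)

library(lme4)
set.seed(1)
d <- data.frame(y = rnorm(100), x = rnorm(100),
                g = factor(rep(1:10, each = 10)),
                w = runif(100, 0.5, 2))
m.unw <- lmer(y ~ x + (1 | g), data = d)
m.wt  <- lmer(y ~ x + (1 | g), data = d, weights = w)
fixef(m.unw)  # unweighted fixed effects
fixef(m.wt)   # weighted fixed effects; should differ if weights are used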

Brian

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] categorical data analysis - fisher.exact for 2x2 and greater

2008-05-07 Thread Brian S Cade
Jeff and others:

Paul Mielke and Ken Berry at Colorado State University developed very fast
Fortran routines for doing Fisher exact tests (as well as some others, like
chi-square and Zelterman's statistics).   Their book (Permutation Methods:
A Distance Function Approach, 2nd ed.  2007. Springer) indicates that
these algorithms were only efficient for <= 5 conditional loops, so 2x2,
3x2, 4x2, 5x2, 6x2, 3x3, and 2x2x2 tables were feasible back in the 1990s.
I'm not sure whether more recent computing capability permits larger tables
now.

Brian

Brian S. Cade

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326



Dr. Jeff Miller [EMAIL PROTECTED] 
Sent by: [EMAIL PROTECTED]
05/07/2008 09:56 AM

To
'Greg Snow' [EMAIL PROTECTED], 'David Winsemius' 
[EMAIL PROTECTED], [EMAIL PROTECTED]
cc

Subject
Re: [R] categorical data analysis - fisher.exact for 2x2 and greater






Hi Simon and all,

I'm pretty sure that you are correct about this. I think it is a
misconception to say that the Fisher exact test is only for a 2x2 table.
It is presented that way in textbooks because, for a 2x2 table, it is easy
to perform.  For larger tables, it becomes complex quickly due to the rate
at which the permutations increase.

When I use it for larger tables, it is hit or miss as to whether R will be
able to do it.  It's not uncommon to get an error implying that R is out of
memory.

Check out Agresti's Categorical Data Analysis on the use of the Fisher
exact test for larger than 2x2 tables, and check out the R archives (and a
Google search) for all the posts about the error message people sometimes
run into.

On April 29th, Marc Schwartz replied to my question about this as follows:

Take a look at the 'workspace' argument in ?fisher.test and review the
second paragraph in Details:

For 2 by 2 cases, p-values are obtained directly using the (central or
non-central) hypergeometric distribution. Otherwise, computations are based
on a C version of the FORTRAN subroutine FEXACT which implements the
network algorithm developed by Mehta and Patel (1986) and improved by
Clarkson, Fan and Joe (1993). The FORTRAN code can be obtained from
http://www.netlib.org/toms/643. Note this fails (with an error message)
when the entries of the table are too large. (It transposes the table if
necessary so it has no more rows than columns. One constraint is that the
product of the row marginals be less than 2^31 - 1.)
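
(A small illustration of these arguments; the 3x3 table below is made up.
Raising workspace helps when the default network-algorithm workspace is
exhausted, and simulating the p-value is a fallback for tables too large
for an exact computation.)

tab <- matrix(c(10, 4,  6,
                 3, 9,  5,
                 5, 2, 12), nrow = 3, byrow = TRUE)
fisher.test(tab)                                    # exact test on an r x c table
fisher.test(tab, workspace = 2e7)                   # enlarge the workspace
fisher.test(tab, simulate.p.value = TRUE, B = 1e5)  # Monte Carlo p-value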



Thank you,
Jeff



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
On
Behalf Of Greg Snow
Sent: Wednesday, May 07, 2008 9:55 AM
To: David Winsemius; [EMAIL PROTECTED]
Subject: Re: [R] categorical data analysis

The last example in ?fisher.test is not a 2x2 table; in fact, it uses
levels with a natural ordering similar to the original question.  Why would
this not be applicable to the situation?


From: [EMAIL PROTECTED] [EMAIL PROTECTED] On 
Behalf
Of David Winsemius [EMAIL PROTECTED]
Sent: Wednesday, May 07, 2008 7:34 AM
To: [EMAIL PROTECTED]
Subject: Re: [R] categorical data analysis

Simon Blomberg [EMAIL PROTECTED] wrote in
news:[EMAIL PROTECTED]:

 But see these posts:

 http://finzi.psych.upenn.edu/R/Rhelp02a/archive/119079.html

 http://finzi.psych.upenn.edu/R/Rhelp02a/archive/119080.html

 Simon.

Interesting reading, but the OP specifically said he was not dealing with
2x2 tables, so neither fisher.test nor the suggested alternatives would
be applicable to his data situation.

--
David Winsemius


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] graphics - line resolution/pixelation going from R to windows metafile

2007-11-16 Thread Brian S Cade
I have a recurring graphics issue that I've not been able to resolve with
R.  If I make a series of regression estimates and then plot the estimated
regression functions over a scatter plot of the data, e.g., using a
sequence of plot() and lines() calls similar to those below

library(splines)   # for bs()
plot(dep12spp13ph1$DAYSWETm2, dep12spp13ph1$AFTERDECOMP,
     pch = dep12spp13ph1$spp, cex = 0.75,
     ylab = "Proportion of biomass after leaching",
     xlab = "Number of days in wetland", xlim = c(0, 250), ylim = c(0, 1))
xplot <- 0:243
xplotbsk64deg1 <- bs(xplot, knots = 62, degree = 1, Boundary.knots = c(0, 243))
lines(xplot, exp(xplotbsk64deg1 %*% ph1.decomp.75.bs$coef[c(1, 2)]), col = "blue")
lines(xplot, exp(xplotbsk64deg1 %*% ph1.decomp.75.bs$coef[c(3, 4)]), col = "red")

and then attempt to copy the resulting graph from the R graph window into a
Windows metafile to paste into another application like PowerPoint or
Word, the resulting regression lines have a lot of jagged edges due to
resolution/pixelation issues.  For a nonlinear function like the one
plotted above, changing the number of pairs of points evaluated with the
lines() function makes the plot look better or worse, but never as good as
would be possible if some sort of smoothing were done on the plotted line
pixels.  I do this type of graphing in SYSTAT all the time and it looks
great (but I would prefer not to have to jump back and forth between R and
SYSTAT).   If I save (or print) the graph from R into an encapsulated
PostScript file, the resulting regression line pixels are smoothed out and
look nice, but then the symbols have been decomposed into their constituent
elements and are screwed up.  I'm guessing that somewhere in R's graphics
parameters/controls there might be a solution, but I've yet to find it.
Any suggestions would be welcome.
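
(One workaround, sketched below: write the plot directly to a file device
rather than copying from the screen. A high-resolution PNG with cairo-based
anti-aliasing, where available, usually smooths the jagged lines, and
win.metafile() on Windows writes a vector WMF directly. File names are
placeholders.)

png("decomp_curves.png", width = 6, height = 5, units = "in",
    res = 600, type = "cairo")  # anti-aliased raster output
## ... repeat the plot() and lines() calls above ...
dev.off()

## On Windows, a vector metafile avoids pixelation entirely:
# win.metafile("decomp_curves.wmf", width = 6, height = 5)
# ... plotting calls ...
# dev.off()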

Brian
 
Brian S. Cade

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  [EMAIL PROTECTED]
tel:  970 226-9326

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.