[R] glm in hdlm?
I am following the example in the vignette for hdlm (p. 19), but I cannot get it to to fit a logistic. For those who don't know the package, it allows one to fit high dimensional data where the number of variables may exceed the number of cases. library(hdlm) LMFUN - function(x,y) return(glm(y ~ x, family=binomial(link=logit))) FUNCVFIT - function(x,y) return(cv.glmnet(x, y, family='binomial')) set.seed(1234) xx-matrix(runif(20*4),20,4) #20 cases, 4 variables xx[,1]-xx[,1]+1:20 yy-c(0,0,0,1,0,1,1,0,1,0,1,0,1,0,1,1,1,1,1,1) #ordinary glms are fitted with no problems with yy either factor or numeric fit1-glm(as.factor(yy)~xx,family=binomial) fit2-glm(yy~xx,family=binomial) fit3-hdlm(as.factor(yy) ~ xx, LMFUN = LMFUN, FUNCVFIT = FUNCVFIT) This produces the error: Error in { : task 1 failed - (list) object cannot be coerced to type 'double' In addition: There were 11 warnings (use warnings() to see them) = fit4-hdlm(yy ~ xx, LMFUN = LMFUN, FUNCVFIT = FUNCVFIT) This produces: Error in { : task 1 failed - (list) object cannot be coerced to type 'double' In addition: Warning messages: 1: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 2: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 3: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 4: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 5: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 6: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 7: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 8: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 9: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold 10: Option grouped=FALSE enforced in cv.glmnet, since 3 observations per fold = Please tell me how to fit the glm in hdlm. Thanks very much for any help. Stan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] using poly() to predict
Thanks Prof Ripley. For anybody else wondering about this, see: http://stackoverflow.com/questions/26728289/extracting-orthogonal-polynomial-coefficients-from-rs-poly-function = The polynomials are defined recursively using the alpha and norm2 coefficients of the poly object you've created. Let's look at an example: z - poly(1:10, 3) attributes(z)$coefs# $alpha# [1] 5.5 5.5 5.5# $norm2# [1]1.0 10.0 82.5 528.0 3088.8 For notation, let's call a_d the element in index d of alpha and let's call n_d the element in index d of norm2. F_d(x) will be the orthogonal polynomial of degree d that is generated. For some base cases we have: F_0(x) = 1 / sqrt(n_2) F_1(x) = (x-a_1) / sqrt(n_3) The rest of the polynomials are recursively defined: F_d(x) = [(x-a_d) * sqrt(n_{d+1}) * F_{d-1}(x) - n_{d+1} / sqrt(n_d) * F_{d-2}(x)] / sqrt(n_{d+2}) To confirm with x=2.1: x - 2.1 predict(z, newdata=x)# 1 2 3# [1,] -0.3743277 0.1440493 0.1890351# ... a - attributes(z)$coefs$alpha n - attributes(z)$coefs$norm2 f0 - 1 / sqrt(n[2])(f1 - (x-a[1]) / sqrt(n[3]))# [1] -0.3743277(f2 - ((x-a[2]) * sqrt(n[3]) * f1 - n[3] / sqrt(n[2]) * f0) / sqrt(n[4]))# [1] 0.1440493(f3 - ((x-a[3]) * sqrt(n[4]) * f2 - n[4] / sqrt(n[3]) * f1) / sqrt(n[5]))# [1] 0.1890351 The most compact way to export your polynomials to your C++ code would probably be to export attributes(z)$coefs$alpha and attributes(z)$coefs$norm2 and then use the recursive formula in C++ to evaluate your polynomials. On Wed, Jan 14, 2015 at 2:38 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote: On 14/01/2015 14:20, Stanislav Aggerwal wrote: This method of finding yhat as x %*% b works when I use raw polynomials: x-1:8 y- 1+ 1*x + .5*x^2 fit-lm(y~poly(x,2,raw=T)) b-coef(fit) xfit-seq(min(x),max(x),length=20) yfit-b[1] + poly(xfit,2,raw=T) %*% b[-1] plot(x,y) lines(xfit,yfit) But it doesn't work when I use orthogonal polynomials: fit-lm(y~poly(x,2)) b-coef(fit) yfit-b[1] + poly(xfit,2) %*% b[-1] plot(x,y) lines(xfit,yfit,col='red') I have a feeling that the second version needs to incorporate poly() coefs (alpha and norm2) somehow. If so, please tell me how. I do know how to use predict() for this. I just want to understand how poly() works. What matters is how lm() and predict() use poly(): see ?makepredictcall and its code. str(fit) might also help. Thanks very much for any help Stan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/ posting-guide.html and provide commented, minimal, self-contained, reproducible code. Please do, and do not send HTML. -- Brian D. Ripley, rip...@stats.ox.ac.uk Emeritus Professor of Applied Statistics, University of Oxford 1 South Parks Road, Oxford OX1 3TG, UK [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] using poly() to predict
This method of finding yhat as x %*% b works when I use raw polynomials: x-1:8 y- 1+ 1*x + .5*x^2 fit-lm(y~poly(x,2,raw=T)) b-coef(fit) xfit-seq(min(x),max(x),length=20) yfit-b[1] + poly(xfit,2,raw=T) %*% b[-1] plot(x,y) lines(xfit,yfit) But it doesn't work when I use orthogonal polynomials: fit-lm(y~poly(x,2)) b-coef(fit) yfit-b[1] + poly(xfit,2) %*% b[-1] plot(x,y) lines(xfit,yfit,col='red') I have a feeling that the second version needs to incorporate poly() coefs (alpha and norm2) somehow. If so, please tell me how. I do know how to use predict() for this. I just want to understand how poly() works. Thanks very much for any help Stan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nonmonotonic glm?
Thanks very much Marc and Ben for the helpful suggestions Stan On Sun, Jan 11, 2015 at 10:28 PM, Ben Bolker bbol...@gmail.com wrote: If you're going to use splines, another possibility is mgcv::gam (also part of standard R installation) require(mgcv) gam(DV ~ s(IV), data= YourDataFrame, family=binomial) this has the advantage that the complexity of the spline is automatically adjusted/selected by the fitting algorithm (although occasionally you need to use s(IV,k=something_bigger) to adjust the default *maximum* complexity chosen by the code) On Sun, Jan 11, 2015 at 5:23 PM, Marc Schwartz marc_schwa...@me.com wrote: On Jan 11, 2015, at 4:00 PM, Ben Bolker bbol...@gmail.com wrote: Stanislav Aggerwal stan.aggerwal at gmail.com writes: I have the following problem. DV is binomial p IV is quantitative variable that goes from negative to positive values. The data look like this (need nonproportional font to view): [snip to make gmane happy] If these data were symmetrical about zero, I could use abs(IV) and do glm(p ~ absIV). I suppose I could fit two glms, one to positive and one to negative IV values. Seems a rather ugly approach. [snip] What's wrong with a GLM with quadratic terms in the predictor variable? This is perfectly respectable, well-defined, and easy to implement: glm(y~poly(x,2),family=binomial,data=...) or y~x+I(x^2) or y~poly(x,2,raw=TRUE) (To complicate things further, this is within-subjects design) glmer, glmmPQL, glmmML, etc. should all support this just fine. As an alternative to Ben's recommendation, consider using a piecewise cubic spline on the IV. This can be done using glm(): # splines is part of the Base R distribution # I am using 'df = 5' below, but this can be adjusted up or down as may be apropos require(splines) glm(DV ~ ns(IV, df = 5), family = binomial, data = YourDataFrame) and as Ben's notes, is more generally supported in mixed models. If this was not mixed model, another logistic regression implementation is in Frank's rms package on CRAN, using his lrm() instead of glm() and rcs() instead of ns(): # after installing rms from CRAN require(rms) lrm(DV ~ rcs(IV, 5), data = YourDataFrame) Regards, Marc Schwartz [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] nonmonotonic glm?
I have the following problem. DV is binomial p IV is quantitative variable that goes from negative to positive values. The data look like this (need nonproportional font to view): o o o o o o o o o --+ 0 If these data were symmetrical about zero, I could use abs(IV) and do glm(p ~ absIV). I suppose I could fit two glms, one to positive and one to negative IV values. Seems a rather ugly approach. (To complicate things further, this is within-subjects design) Any suggestions welcome. Thanks! Bill [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] get mouse position without waiting for a click
Consider the following: par(mar=c(0,0,0,0),xaxs = 'i',yaxs='i') plot.new() for(i in 1:20) { z - matrix(runif(256*256), ncol=256) dev.hold() image(z, col=grey(0:255/255),zlim=c(0,1),useRaster=TRUE) dev.flush() Sys.sleep(.1) } I would like to continuously display the animation until a mouse click happens. Obviously, locator() can't be used for this, because it will halt the animation while waiting for the mouse click. So something like this won't work: while((r-locator(n=1)) == NULL) { z - matrix(runif(256*256), ncol=256) dev.hold() image(z, col=grey(0:255/255),zlim=c(0,1),useRaster=TRUE) dev.flush() Sys.sleep(.1) } Can somebody suggest a way to get user input while displaying the animation? I am willing to use a keypress if not the mouse. Essentially, I need an input routine that does not wait for a response but just checks at one instant (like kbhit in C). Thanks very much for any help. Stan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] get mouse position without waiting for a click
Very good question -- sorry to have left that info out of my posting. windows() Thanks for any help. Bill On Mon, Sep 30, 2013 at 4:46 PM, Bert Gunter gunter.ber...@gene.com wrote: On Windows or X11, (others ???) ?getGraphicsEvent Cheers, Bert On Mon, Sep 30, 2013 at 7:46 AM, Stanislav Aggerwal stan.agger...@gmail.com wrote: Consider the following: par(mar=c(0,0,0,0),xaxs = 'i',yaxs='i') plot.new() for(i in 1:20) { z - matrix(runif(256*256), ncol=256) dev.hold() image(z, col=grey(0:255/255),zlim=c(0,1),useRaster=TRUE) dev.flush() Sys.sleep(.1) } I would like to continuously display the animation until a mouse click happens. Obviously, locator() can't be used for this, because it will halt the animation while waiting for the mouse click. So something like this won't work: while((r-locator(n=1)) == NULL) { z - matrix(runif(256*256), ncol=256) dev.hold() image(z, col=grey(0:255/255),zlim=c(0,1),useRaster=TRUE) dev.flush() Sys.sleep(.1) } Can somebody suggest a way to get user input while displaying the animation? I am willing to use a keypress if not the mouse. Essentially, I need an input routine that does not wait for a response but just checks at one instant (like kbhit in C). Thanks very much for any help. Stan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] get mouse position without waiting for a click
thanks Greg Bill On Monday, September 30, 2013, Greg Snow 538...@gmail.com wrote: Some code that you can look at for examples of capturing the mouse position without clicking include the playSudoku function in the sudoku package and several functions in the TeachingDemos package including HWidentify, HTKidentify, dynIdentify, and TkIdentify. On Mon, Sep 30, 2013 at 8:46 AM, Stanislav Aggerwal stan.agger...@gmail.com javascript:_e({}, 'cvml', 'stan.agger...@gmail.com'); wrote: Consider the following: par(mar=c(0,0,0,0),xaxs = 'i',yaxs='i') plot.new() for(i in 1:20) { z - matrix(runif(256*256), ncol=256) dev.hold() image(z, col=grey(0:255/255),zlim=c(0,1),useRaster=TRUE) dev.flush() Sys.sleep(.1) } I would like to continuously display the animation until a mouse click happens. Obviously, locator() can't be used for this, because it will halt the animation while waiting for the mouse click. So something like this won't work: while((r-locator(n=1)) == NULL) { z - matrix(runif(256*256), ncol=256) dev.hold() image(z, col=grey(0:255/255),zlim=c(0,1),useRaster=TRUE) dev.flush() Sys.sleep(.1) } Can somebody suggest a way to get user input while displaying the animation? I am willing to use a keypress if not the mouse. Essentially, I need an input routine that does not wait for a response but just checks at one instant (like kbhit in C). Thanks very much for any help. Stan [[alternative HTML version deleted]] __ R-help@r-project.org javascript:_e({}, 'cvml', 'R-help@r-project.org');mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com javascript:_e({}, 'cvml', '538...@gmail.com'); [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] repeated measures logistic regression
Thanks very much Ben for your extremely helpful response. I have loads of data so this worked fine. cheers, Stan On Saturday, July 27, 2013, Ben Bolker wrote: Stanislav Aggerwal stan.aggerwal at gmail.com writes: I have searched the r-help archive and saw only one unanswered post related to mine. Take a look at the r-sig-mixed-models (@r-project.org) mailing list and archive ... My design is as follows. - y is Bernoulli response - x1 is continuous variable - x2 is categorical (factor) variable with two levels The experiment is completely within subjects. That is, each subject receives each combination of x1 and x2. This is a repeated measures logistic regression set-up. The experiment will give two ogives for p(y==1) vs x1, one for level1 and one for level2 of x2. The effect of x2 should be that for level2 compared to level1, the ogive should have a shallower slope and increased intercept. I am struggling with finding the model using lme4. Here is a guess at it: glmer(y~x1*x2 +(1|subject),family=binomial) So far as I understand it, the 1|subject part says that subject is a random effect. But I do not really understand the notation or how to specify that x1 and x2 are repeated measures variables. In the end I want a model that includes a random effect for subjects, and gives estimated slopes and intercepts for level1 and level2. I believe you want glmer(y~x1*x2 +(x1*x2|subject),family=binomial,data=...) (I strongly recommend including the data= argument in your call) This will give a population-level estimate of intercept (log-odds in group 1 at x1=0) treatment effect on intercept (log-odds(level2,x1=0)-log-odds(level1,x=0)) log-odds slope in level 1 difference in slopes as well as among-individual variances in all four of these parameters, and covariances among all the parameters (i.e. a 4x4 variance-covariance matrix for these parameters). For binary data and estimating 4 fixed + 10 RE parameters (i.e., variances and covariances), you're going to need a lot of data -- very conservatively, 140 total observations. It may help to center your x1 variable. see http://glmm.wikidot.com/faq (especially http://glmm.wikidot.com/faq#modelspec), and the r-sig-mixed-models mailing list. __ R-help@r-project.org javascript:; mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] repeated measures logistic regression
I have searched the r-help archive and saw only one unanswered post related to mine. My design is as follows. - y is Bernoulli response - x1 is continuous variable - x2 is categorical (factor) variable with two levels The experiment is completely within subjects. That is, each subject receives each combination of x1 and x2. This is a repeated measures logistic regression set-up. The experiment will give two ogives for p(y==1) vs x1, one for level1 and one for level2 of x2. The effect of x2 should be that for level2 compared to level1, the ogive should have a shallower slope and increased intercept. I am struggling with finding the model using lme4. Here is a guess at it: glmer(y~x1*x2 +(1|subject),family=binomial) So far as I understand it, the 1|subject part says that subject is a random effect. But I do not really understand the notation or how to specify that x 1 and x2 are repeated measures variables. In the end I want a model that includes a random effect for subjects, and gives estimated slopes and intercepts for level1 and level2. Thanks very much for any help. Stan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.