Re: [R] Optimisation and NaN Errors using clm() and clmm()
On 18 April 2013 18:38, Thomas Foxley thomasfox...@aol.com wrote:

Rune,

Thank you very much for your response. I don't actually have the models that failed to converge from the first (glmulti) part, as they were not saved with the confidence set. glmulti generates thousands of models, so it seems reasonable that a few of these may not converge. The clmm() model I provided was just an example; not all models have 17 parameters. Only one or two models produced errors (the example I gave being one of them), so perhaps overparameterisation is the root of the problem.

Regarding incomplete data: there are only 103 (of 314) records where I have data for every predictor. The number of observations included will obviously vary between models; models with fewer predictors will include more observations. glmulti acts as a wrapper for another function, meaning (in this case) NAs are treated as they would be in clm(). Is there a way around this (apart from filling in the missing data)? I believe it's possible to limit model complexity in the glmulti call, which may or may not increase the number of observations. How would this affect interpretation of the results?

Since the likelihood (and hence also AIC-like criteria) depends on the number of observations, I would make sure that only models fitted to the same number of observations are compared using model selection criteria. This means that I would make a data.frame with complete observations, either by simply deleting all rows with one or more missing predictors, or by imputing some data points. If one or a couple of variables are responsible for most of the missing observations, you could disregard those variables before deleting rows with NAs. As I said, I am no expert in model averaging or glmulti usage, so there might be better approaches or other opinions on this.
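The complete-case strategy Rune describes can be sketched in base R. The data frame name 'database' and the predictor names follow the thread, but the data generated here is purely illustrative (the real data set is not available), and the NA pattern is an assumption for demonstration.

```r
## Sketch of building one complete-case data set so that every
## candidate model is fitted to the same observations.
## 'database' and predictor_1..predictor_9 follow the thread;
## the values below are made up for illustration.
set.seed(1)
database <- data.frame(matrix(
  rnorm(314 * 9), nrow = 314,
  dimnames = list(NULL, paste0("predictor_", 1:9))))
## Pretend one variable is responsible for most of the missingness:
database$predictor_5[sample(314, 150)] <- NA

## Option 1: keep only rows that are complete in all predictors
complete_db <- database[complete.cases(database), ]
nrow(complete_db)  # 164 rows remain

## Option 2: drop the variable with the most NAs first, then
## take complete cases of the remaining predictors
na_counts  <- colSums(is.na(database))
worst      <- names(which.max(na_counts))      # "predictor_5" here
reduced_db <- database[, setdiff(names(database), worst)]
reduced_db <- reduced_db[complete.cases(reduced_db), ]
nrow(reduced_db)   # all 314 rows remain
```

Option 2 trades a predictor for observations, which is often the better bargain when a single variable drives most of the missingness.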
Cheers,
Rune

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Optimisation and NaN Errors using clm() and clmm()
Rune,

Thank you very much for your response. I don't actually have the models that failed to converge from the first (glmulti) part, as they were not saved with the confidence set. glmulti generates thousands of models, so it seems reasonable that a few of these may not converge. The clmm() model I provided was just an example; not all models have 17 parameters. Only one or two models produced errors (the example I gave being one of them), so perhaps overparameterisation is the root of the problem.

Regarding incomplete data: there are only 103 (of 314) records where I have data for every predictor. The number of observations included will obviously vary between models; models with fewer predictors will include more observations. glmulti acts as a wrapper for another function, meaning (in this case) NAs are treated as they would be in clm(). Is there a way around this (apart from filling in the missing data)? I believe it's possible to limit model complexity in the glmulti call, which may or may not increase the number of observations. How would this affect interpretation of the results?

Thanks again,
Tom

On 16/04/13 07:54, Rune Haubo wrote:

On 15 April 2013 13:18, Thomas thomasfox...@aol.com wrote:

Dear List,

I am using both the clm() and clmm() functions from the R package 'ordinal'. I am fitting an ordinal dependent variable with 5 categories to 9 continuous predictors, all of which have been normalised (mean subtracted, then divided by the standard deviation), using a probit link function. From this global model I am generating a confidence set of 200 models using clm() and the 'glmulti' R package. This produces these errors:

model.2.10 <- glmulti(as.factor(dependent) ~
    predictor_1*predictor_2*predictor_3*predictor_4*predictor_5*
    predictor_6*predictor_7*predictor_8*predictor_9,
    data = database, fitfunc = clm, link = "probit", method = "g",
    crit = aicc, confsetsize = 200, marginality = TRUE)
...
After 670 generations:
Best model: as.factor(dependent)~1+predictor_1+predictor_2+predictor_3+predictor_4+predictor_5+predictor_6+predictor_8+predictor_9+predictor_4:predictor_3+predictor_6:predictor_2+predictor_8:predictor_5+predictor_9:predictor_1+predictor_9:predictor_4+predictor_9:predictor_5+predictor_9:predictor_6
Crit= 183.716706496392
Mean crit= 202.022138576506
Improvements in best and average IC have been below the specified goals.
Algorithm is declared to have converged.
Completed.
There were 24 warnings (use warnings() to see them)

warnings()
Warning messages:
1: optimization failed: step factor reduced below minimum
2: optimization failed: step factor reduced below minimum
3: optimization failed: step factor reduced below minimum
etc.

I am then re-fitting each of the 200 models with the clmm() function, with 2 random factors (family nested within order). I get this error in a few of the re-fitted models:

model.2.glmm.2 <- clmm(as.factor(dependent) ~ 1 + predictor_1 +
    predictor_2 + predictor_3 + predictor_6 + predictor_7 +
    predictor_8 + predictor_9 + predictor_6:predictor_2 +
    predictor_7:predictor_2 + predictor_7:predictor_3 +
    predictor_8:predictor_2 + predictor_9:predictor_1 +
    predictor_9:predictor_2 + predictor_9:predictor_3 +
    predictor_9:predictor_6 + predictor_9:predictor_7 +
    predictor_9:predictor_8 + (1|order/family),
    link = "probit", data = database)

summary(model.2.glmm.2)
Cumulative Link Mixed Model fitted with the Laplace approximation

formula: as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 +
    predictor_6 + predictor_7 + predictor_8 + predictor_9 +
    predictor_6:predictor_2 + predictor_7:predictor_2 +
    predictor_7:predictor_3 + predictor_8:predictor_2 +
    predictor_9:predictor_1 + predictor_9:predictor_2 +
    predictor_9:predictor_3 + predictor_9:predictor_6 +
    predictor_9:predictor_7 + predictor_9:predictor_8 + (1 | order/family)
data: database

 link   threshold nobs logLik AIC    niter    max.grad cond.H
 probit flexible  103  -65.56 173.13 58(3225) 8.13e-06 4.3e+03

Random effects:
             Var       Std.Dev
family:order 7.493e-11 8.656e-06
order        1.917e-12 1.385e-06
Number of groups: family:order 12, order 4

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
predictor_1              0.40802    0.78685   0.519   0.6041
predictor_2              0.02431    0.26570   0.092   0.9271
predictor_3             -0.84486    0.32056  -2.636   0.0084 **
predictor_6              0.65392    0.34348   1.904   0.0569 .
predictor_7              0.71730    0.29596   2.424   0.0154 *
predictor_8             -1.37692    0.75660  -1.820   0.0688 .
predictor_9              0.15642    0.28969   0.540   0.5892
predictor_2:predictor_6 -0.46880    0.18829  -2.490   0.0128 *
predictor_2:predictor_7  4.97365    0.82692   6.015 1.80e-09 ***
predictor_3:predictor_7 -1.13192    0.46639  -2.427   0.0152 *
predictor_2:predictor_8 -5.52913    0.88476  -6.249 4.12e-10 ***
predictor_1:predictor_9  4.28519         NA      NA       NA
predictor_2:predictor_9 -0.26558    0.10541  -2.520   0.0117 *
predictor_3:predictor_9 -1.49790         NA      NA       NA
predictor_6:predictor_9 -1.31538         NA      NA       NA
predictor_7:predictor_9 -4.41998         NA      NA       NA
predictor_8:predictor_9  3.99709         NA      NA       NA
Re: [R] Optimisation and NaN Errors using clm() and clmm()
On 15 April 2013 13:18, Thomas thomasfox...@aol.com wrote:

Dear List,

I am using both the clm() and clmm() functions from the R package 'ordinal'. I am fitting an ordinal dependent variable with 5 categories to 9 continuous predictors, all of which have been normalised (mean subtracted, then divided by the standard deviation), using a probit link function. From this global model I am generating a confidence set of 200 models using clm() and the 'glmulti' R package. This produces these errors:

model.2.10 <- glmulti(as.factor(dependent) ~
    predictor_1*predictor_2*predictor_3*predictor_4*predictor_5*
    predictor_6*predictor_7*predictor_8*predictor_9,
    data = database, fitfunc = clm, link = "probit", method = "g",
    crit = aicc, confsetsize = 200, marginality = TRUE)
...
After 670 generations:
Best model: as.factor(dependent)~1+predictor_1+predictor_2+predictor_3+predictor_4+predictor_5+predictor_6+predictor_8+predictor_9+predictor_4:predictor_3+predictor_6:predictor_2+predictor_8:predictor_5+predictor_9:predictor_1+predictor_9:predictor_4+predictor_9:predictor_5+predictor_9:predictor_6
Crit= 183.716706496392
Mean crit= 202.022138576506
Improvements in best and average IC have been below the specified goals.
Algorithm is declared to have converged.
Completed.
There were 24 warnings (use warnings() to see them)

warnings()
Warning messages:
1: optimization failed: step factor reduced below minimum
2: optimization failed: step factor reduced below minimum
3: optimization failed: step factor reduced below minimum
etc.

I am then re-fitting each of the 200 models with the clmm() function, with 2 random factors (family nested within order). I get this error in a few of the re-fitted models:

model.2.glmm.2 <- clmm(as.factor(dependent) ~ 1 + predictor_1 +
    predictor_2 + predictor_3 + predictor_6 + predictor_7 +
    predictor_8 + predictor_9 + predictor_6:predictor_2 +
    predictor_7:predictor_2 + predictor_7:predictor_3 +
    predictor_8:predictor_2 + predictor_9:predictor_1 +
    predictor_9:predictor_2 + predictor_9:predictor_3 +
    predictor_9:predictor_6 + predictor_9:predictor_7 +
    predictor_9:predictor_8 + (1|order/family),
    link = "probit", data = database)

summary(model.2.glmm.2)
Cumulative Link Mixed Model fitted with the Laplace approximation

formula: as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 +
    predictor_6 + predictor_7 + predictor_8 + predictor_9 +
    predictor_6:predictor_2 + predictor_7:predictor_2 +
    predictor_7:predictor_3 + predictor_8:predictor_2 +
    predictor_9:predictor_1 + predictor_9:predictor_2 +
    predictor_9:predictor_3 + predictor_9:predictor_6 +
    predictor_9:predictor_7 + predictor_9:predictor_8 + (1 | order/family)
data: database

 link   threshold nobs logLik AIC    niter    max.grad cond.H
 probit flexible  103  -65.56 173.13 58(3225) 8.13e-06 4.3e+03

Random effects:
             Var       Std.Dev
family:order 7.493e-11 8.656e-06
order        1.917e-12 1.385e-06
Number of groups: family:order 12, order 4

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
predictor_1              0.40802    0.78685   0.519   0.6041
predictor_2              0.02431    0.26570   0.092   0.9271
predictor_3             -0.84486    0.32056  -2.636   0.0084 **
predictor_6              0.65392    0.34348   1.904   0.0569 .
predictor_7              0.71730    0.29596   2.424   0.0154 *
predictor_8             -1.37692    0.75660  -1.820   0.0688 .
predictor_9              0.15642    0.28969   0.540   0.5892
predictor_2:predictor_6 -0.46880    0.18829  -2.490   0.0128 *
predictor_2:predictor_7  4.97365    0.82692   6.015 1.80e-09 ***
predictor_3:predictor_7 -1.13192    0.46639  -2.427   0.0152 *
predictor_2:predictor_8 -5.52913    0.88476  -6.249 4.12e-10 ***
predictor_1:predictor_9  4.28519         NA      NA       NA
predictor_2:predictor_9 -0.26558    0.10541  -2.520   0.0117 *
predictor_3:predictor_9 -1.49790         NA      NA       NA
predictor_6:predictor_9 -1.31538         NA      NA       NA
predictor_7:predictor_9 -4.41998         NA      NA       NA
predictor_8:predictor_9  3.99709         NA      NA       NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
0|1  -0.2236     0.3072  -0.728
1|2   1.4229     0.3634   3.915

(211 observations deleted due to missingness)
Warning message:
In sqrt(diag(vc)[1:npar]) : NaNs produced

This warning is due to a (near) singular variance-covariance matrix of the model parameters, which in turn is due to the fact that the model converged to a boundary solution: both random-effects variance parameters are zero. If you exclude the random terms and refit the model with clm, the variance-covariance matrix will probably be well defined and standard errors can be computed.

Another thing is that you are fitting 17 regression parameters and 2 random-effect terms (which in the end do not count) to only 103 observations. I would be worried about overfitting, or perhaps even non-fitting. I think I would also be concerned about the 211 observations that are incomplete, and I would be careful with automatic model selection/averaging etc. on incomplete data (though I don't know how/if glmulti actually deals with that).
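Rune's suggestion to drop the boundary random effects and refit with clm() can be sketched as follows. Since the thread's 'database' is not available, this uses the wine data shipped with the ordinal package as a stand-in; the fm_mixed/fm_fixed names are illustrative, and with the wine data the judge variance is not actually at the boundary (unlike the thread's fit, where both variances were ~0).

```r
## Sketch of the refitting workflow: if clmm() converges to a boundary
## solution (random-effect variance ~ 0), the Hessian is near-singular
## and sqrt(diag(vc)) produces NaNs; refitting with clm() without the
## random term gives a well-defined variance-covariance matrix.
library(ordinal)  # provides clm(), clmm(), and the 'wine' data

## Mixed model: inspect the random-effect variance in the summary
fm_mixed <- clmm(rating ~ temp + contact + (1 | judge),
                 data = wine, link = "probit")
summary(fm_mixed)

## If the variance were (essentially) zero, drop the random term
## and refit as a plain cumulative link model:
fm_fixed <- clm(rating ~ temp + contact,
                data = wine, link = "probit")
summary(fm_fixed)  # standard errors computable from a full-rank Hessian
```

The fixed-effects fit is what Rune recommends for diagnosing the NaN warnings; a zero variance estimate means the data provide no evidence of grouping-level variation, so the simpler clm() fit loses essentially nothing.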
[R] Optimisation and NaN Errors using clm() and clmm()
Dear List,

I am using both the clm() and clmm() functions from the R package 'ordinal'. I am fitting an ordinal dependent variable with 5 categories to 9 continuous predictors, all of which have been normalised (mean subtracted, then divided by the standard deviation), using a probit link function. From this global model I am generating a confidence set of 200 models using clm() and the 'glmulti' R package. This produces these errors:

model.2.10 <- glmulti(as.factor(dependent) ~
    predictor_1*predictor_2*predictor_3*predictor_4*predictor_5*
    predictor_6*predictor_7*predictor_8*predictor_9,
    data = database, fitfunc = clm, link = "probit", method = "g",
    crit = aicc, confsetsize = 200, marginality = TRUE)
...
After 670 generations:
Best model: as.factor(dependent)~1+predictor_1+predictor_2+predictor_3+predictor_4+predictor_5+predictor_6+predictor_8+predictor_9+predictor_4:predictor_3+predictor_6:predictor_2+predictor_8:predictor_5+predictor_9:predictor_1+predictor_9:predictor_4+predictor_9:predictor_5+predictor_9:predictor_6
Crit= 183.716706496392
Mean crit= 202.022138576506
Improvements in best and average IC have been below the specified goals.
Algorithm is declared to have converged.
Completed.
There were 24 warnings (use warnings() to see them)

warnings()
Warning messages:
1: optimization failed: step factor reduced below minimum
2: optimization failed: step factor reduced below minimum
3: optimization failed: step factor reduced below minimum
etc.

I am then re-fitting each of the 200 models with the clmm() function, with 2 random factors (family nested within order). I get this error in a few of the re-fitted models:

model.2.glmm.2 <- clmm(as.factor(dependent) ~ 1 + predictor_1 +
    predictor_2 + predictor_3 + predictor_6 + predictor_7 +
    predictor_8 + predictor_9 + predictor_6:predictor_2 +
    predictor_7:predictor_2 + predictor_7:predictor_3 +
    predictor_8:predictor_2 + predictor_9:predictor_1 +
    predictor_9:predictor_2 + predictor_9:predictor_3 +
    predictor_9:predictor_6 + predictor_9:predictor_7 +
    predictor_9:predictor_8 + (1|order/family),
    link = "probit", data = database)

summary(model.2.glmm.2)
Cumulative Link Mixed Model fitted with the Laplace approximation

formula: as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 +
    predictor_6 + predictor_7 + predictor_8 + predictor_9 +
    predictor_6:predictor_2 + predictor_7:predictor_2 +
    predictor_7:predictor_3 + predictor_8:predictor_2 +
    predictor_9:predictor_1 + predictor_9:predictor_2 +
    predictor_9:predictor_3 + predictor_9:predictor_6 +
    predictor_9:predictor_7 + predictor_9:predictor_8 + (1 | order/family)
data: database

 link   threshold nobs logLik AIC    niter    max.grad cond.H
 probit flexible  103  -65.56 173.13 58(3225) 8.13e-06 4.3e+03

Random effects:
             Var       Std.Dev
family:order 7.493e-11 8.656e-06
order        1.917e-12 1.385e-06
Number of groups: family:order 12, order 4

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
predictor_1              0.40802    0.78685   0.519   0.6041
predictor_2              0.02431    0.26570   0.092   0.9271
predictor_3             -0.84486    0.32056  -2.636   0.0084 **
predictor_6              0.65392    0.34348   1.904   0.0569 .
predictor_7              0.71730    0.29596   2.424   0.0154 *
predictor_8             -1.37692    0.75660  -1.820   0.0688 .
predictor_9              0.15642    0.28969   0.540   0.5892
predictor_2:predictor_6 -0.46880    0.18829  -2.490   0.0128 *
predictor_2:predictor_7  4.97365    0.82692   6.015 1.80e-09 ***
predictor_3:predictor_7 -1.13192    0.46639  -2.427   0.0152 *
predictor_2:predictor_8 -5.52913    0.88476  -6.249 4.12e-10 ***
predictor_1:predictor_9  4.28519         NA      NA       NA
predictor_2:predictor_9 -0.26558    0.10541  -2.520   0.0117 *
predictor_3:predictor_9 -1.49790         NA      NA       NA
predictor_6:predictor_9 -1.31538         NA      NA       NA
predictor_7:predictor_9 -4.41998         NA      NA       NA
predictor_8:predictor_9  3.99709         NA      NA       NA
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Threshold coefficients:
    Estimate Std. Error z value
0|1  -0.2236     0.3072  -0.728
1|2   1.4229     0.3634   3.915

(211 observations deleted due to missingness)
Warning message:
In sqrt(diag(vc)[1:npar]) : NaNs produced

I have tried a number of different approaches, each of which has its own problems. I have fixed these using various suggestions from online forums (e.g. https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q1/015328.html, https://stat.ethz.ch/pipermail/r-sig-mixed-models/2011q2/016165.html), and this is as good as I can get it.

After the first stage (generating the model set with glmulti) I tested every model in the confidence set individually; there were no errors, but there was clearly a problem during the model selection process. Should I be worried? No errors appear in the top 5% of re-fitted models (which are the only ones I will be using), but I am concerned that the errors may be indicative of a problem with my approach. A further worry is that the errors might be removing models that could otherwise be included.

Any help would be much appreciated.

Tom
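Tom's worry that "the errors might be removing models that could otherwise be included" suggests re-fitting each candidate formula while recording warnings explicitly, so failing models can be identified rather than silently dropped. The helper below (fit_with_warnings is a hypothetical name, not part of glmulti or ordinal) sketches this with base R condition handling; the demonstration uses lm() as a stand-in fitter so it runs without the thread's data.

```r
## Hypothetical helper: fit a model while capturing warnings (such as
## "optimization failed: step factor reduced below minimum") instead
## of letting them scroll past in warnings().
fit_with_warnings <- function(fit_fun, formula, data) {
  msgs <- character(0)
  fit <- withCallingHandlers(
    tryCatch(fit_fun(formula, data = data),
             error = function(e) {            # errors: record and return NULL
               msgs <<- c(msgs, conditionMessage(e)); NULL
             }),
    warning = function(w) {                   # warnings: record and muffle
      msgs <<- c(msgs, conditionMessage(w))
      invokeRestart("muffleWarning")
    })
  list(fit = fit, messages = msgs)
}

## Toy demonstration with lm() standing in for clm()/clmm():
set.seed(42)
d <- data.frame(x = 1:20, y = rnorm(20))
res <- fit_with_warnings(lm, y ~ x, d)
length(res$messages)  # 0: this fit is clean

## A fitter that warns still returns its fit, with the warning recorded:
noisy_fit <- function(f, data) {
  warning("optimization failed: step factor reduced below minimum")
  lm(f, data = data)
}
res_bad <- fit_with_warnings(noisy_fit, y ~ x, d)
res_bad$messages  # the captured warning text
```

Applied over the confidence set (e.g. with lapply over the candidate formulas), this makes it easy to see exactly which models warned or errored and to inspect them individually, rather than inferring problems from a count of 24 warnings.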