[R] glm fit
Hi,

I am asking here because I hope someone can help me understand this problem. I have fitted a glm in R, with these results:

glm1 <- glm(log(claims) ~ log(sum) * as.factor(grp), family = gaussian(link = identity))
summary(glm1)

Call:
glm(formula = log(claims) ~ log(sum) * as.factor(grp), family = gaussian(link = identity))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-6.6836  -1.3626  -0.2576   1.2038   8.2480

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)               3.525657   0.436102   8.084 8.18e-16 ***
log(sum)                  0.334288   0.025668  13.024  < 2e-16 ***
as.factor(grp)2           0.434262   0.976240   0.445   0.6565
as.factor(grp)3           3.666490   1.436471   2.552   0.0107 *
as.factor(grp)4           0.040782   1.024730   0.040   0.9683
log(sum):as.factor(grp)2  0.007719   0.061914   0.125   0.9008
log(sum):as.factor(grp)3 -0.209986   0.091578  -2.293   0.0219 *
log(sum):as.factor(grp)4  0.059342   0.067320   0.881   0.3781
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 3.693731)

    Null deviance: 15839  on 4035  degrees of freedom
Residual deviance: 14878  on 4028  degrees of freedom
AIC: 16737

Number of Fisher Scoring iterations: 2

But I'm not sure what I get out of the summary. What does it tell me? What should the formula for Y look like, based on the summary? Is it something like Y = 0.334288*X_sum + 0.434262*X_2 + ...? And once I have an expression for Y, what does it tell me? Is it an expected value? Can anyone help me, please?

--
View this message in context: http://www.nabble.com/glm-fit-tp23443121p23443121.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
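One way to read the coefficient table above: with log(sum) * as.factor(grp), each group gets its own intercept and its own slope on log(sum), formed by adding that group's adjustment terms to the baseline (group 1) terms. With the posted estimates, group 3 would have intercept 3.525657 + 3.666490 = 7.192147 and slope 0.334288 - 0.209986 = 0.124302, so the fitted mean of log(claims) in group 3 is 7.192147 + 0.124302 * log(sum). The sketch below checks the same bookkeeping against predict(). It runs on simulated stand-in data, since the original data are not in the post; the data frame `dat` and the values used to generate it are assumptions for illustration only.

```r
# Simulated stand-in data (assumption): the real claims/sum/grp values are not in the post.
set.seed(1)
n   <- 400
dat <- data.frame(
  sum = exp(rnorm(n, mean = 15, sd = 2)),
  grp = sample(1:4, n, replace = TRUE)
)
dat$claims <- exp(3.5 + 0.33 * log(dat$sum) + rnorm(n, sd = 1.9))

# Same model structure as in the post.
glm1 <- glm(log(claims) ~ log(sum) * as.factor(grp),
            family = gaussian(link = "identity"), data = dat)
b <- coef(glm1)

# Group 3: baseline terms plus the group-3 adjustment terms.
int3   <- b[["(Intercept)"]] + b[["as.factor(grp)3"]]
slope3 <- b[["log(sum)"]]    + b[["log(sum):as.factor(grp)3"]]

# Fitted mean of log(claims) for a group-3 observation with sum = 1e6, by hand ...
by_hand <- int3 + slope3 * log(1e6)

# ... and from predict(), which should give the same number.
by_predict <- predict(glm1, newdata = data.frame(sum = 1e6, grp = 3))
all.equal(by_hand, unname(by_predict))
```

Because the response is log(claims) and the family is gaussian with identity link, the value obtained is the estimated expected value of log(claims), not of claims itself.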
[R] Summary help
Hi,

I have fitted a gamma model, and I am wondering whether I can read the shape and the scale directly from the summary:

            Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.612e+00  4.735e-02  34.052  < 2e-16 ***
myvalue     3.564e-02  2.787e-03  12.788  < 2e-16 ***
...

Is the shape = 1.612e+00 and the scale = 3.564e-02? Is it the other way around, or can't it be done?
Re: [R] Summary help
The glm is

glm(log(mydata) ~ log(max_data) * as.factor(grp), family = Gamma(link = log))

and I was wondering whether the scale and shape can be read from the summary.

> There are quite a few gamma models around, so you should tell us more. glmXXX? lmer?
>
> Dieter
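A hedged note on the question in this thread: for a model fitted with glm() and family = Gamma, the rows of the coefficient table are regression coefficients on the link scale, not the shape and scale. The dispersion parameter reported by summary() estimates 1/shape, and since the mean of a gamma distribution is shape * scale, the scale differs between observations with different fitted means. The sketch below uses simulated data (the names `x`, `y` and the true shape of 4 are assumptions) and also shows gamma.shape() from the MASS package, which gives a maximum likelihood estimate of the shape.

```r
# Simulated gamma data with a log-linear mean (assumption, for illustration only).
set.seed(2)
x  <- runif(200, min = 1, max = 10)
mu <- exp(0.5 + 0.2 * x)
y  <- rgamma(200, shape = 4, rate = 4 / mu)   # mean = shape / rate = mu

fit <- glm(y ~ x, family = Gamma(link = "log"))

# summary() reports the dispersion; its reciprocal is a moment-type shape estimate.
disp      <- summary(fit)$dispersion
shape_hat <- 1 / disp

# The scale is not a single number: mean = shape * scale, so scale = fitted mean / shape.
scale_hat <- fitted(fit) / shape_hat

# Maximum likelihood estimate of the shape, if MASS is installed.
if (requireNamespace("MASS", quietly = TRUE)) {
  shape_ml <- MASS::gamma.shape(fit)$alpha
  print(c(moment = shape_hat, ml = shape_ml))
}
```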
[R] Kolmogorov-Smirnov test
I have a distribution function and an empirical distribution function. How do I run a Kolmogorov-Smirnov test in R? Let's call the empirical distribution function Fn on [0,1] and the distribution function F on [0,1].

ks.test( )

Thanks for the help.
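A minimal sketch of the one-sample test, under the assumption that the raw sample behind Fn is available: ks.test() takes the data vector itself (not the empirical distribution function) together with the hypothesised CDF, given either as a function or as the name of one. The sample `x` and the CDF `F0` below are made up for illustration; replace them with the real data and distribution function.

```r
# Hypothetical sample on [0, 1] and hypothesised CDF (both assumptions).
set.seed(3)
x  <- rbeta(100, shape1 = 2, shape2 = 2)
F0 <- function(q) pbeta(q, shape1 = 2, shape2 = 2)

# One-sample, two-sided Kolmogorov-Smirnov test of x against F0.
res <- ks.test(x, F0)
print(res)

# The statistic is the largest vertical distance between Fn and F0,
# which can be computed by hand at the sorted sample points.
n      <- length(x)
Fx     <- F0(sort(x))
D_hand <- max(c((1:n) / n - Fx, Fx - (0:(n - 1)) / n))

# The empirical distribution function itself, e.g. for plotting.
Fn <- ecdf(x)
```

Note that the usual p-value assumes F0 is fully specified in advance; if its parameters were estimated from the same sample, the test is no longer exact.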
[R] How to read the summary
How can I decide from the summary function which glm (fit1, fit2 or fit3) fits the data best? I don't know what to look for, so could someone please explain the important output?

fit1 <- glm(Y~X, family=gaussian(link=identity))
fit2 <- glm(Y~X, family=gaussian(link=log))
fit3 <- glm(Y~X, family=Gamma(link=log))

summary(fit1)

Call:
glm(formula = Y ~ X, family = gaussian(link = identity))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.6619  -1.9693  -0.4119   2.0787   3.9664

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.4285     1.6213  -0.264 0.798258
X             4.3952     0.7089   6.200 0.000259 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 6.784605)

    Null deviance: 315.081  on 9  degrees of freedom
Residual deviance:  54.277  on 8  degrees of freedom
AIC: 51.294

Number of Fisher Scoring iterations: 2

summary(fit2)

Call:
glm(formula = Y ~ X, family = gaussian(link = log))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.5489  -0.2960   0.4776   0.6353   1.2773

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.50537    0.16562   3.051   0.0158 *
X            0.66352    0.05083  13.055 1.13e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for gaussian family taken to be 1.083989)

    Null deviance: 315.0810  on 9  degrees of freedom
Residual deviance:   8.6718  on 8  degrees of freedom
AIC: 32.954

Number of Fisher Scoring iterations: 6

summary(fit3)

Call:
glm(formula = Y ~ X, family = Gamma(link = log))

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-0.35269  -0.09272   0.02550   0.13625   0.18018

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.85959    0.11244   7.645 6.04e-05 ***
X            0.53134    0.04916  10.808 4.74e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.03262828)

    Null deviance: 4.31315  on 9  degrees of freedom
Residual deviance: 0.28385  on 8  degrees of freedom
AIC: 36.65

Number of Fisher Scoring iterations: 5
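A hedged sketch of one common way to compare such fits. The residual deviances of the Gaussian and Gamma fits are on different scales (54.277 and 8.6718 versus 0.28385 in the posted output) and are not directly comparable across families. AIC is often used instead when all models have the same untransformed response; on the posted output, fit2 has the smallest AIC (32.954), followed by fit3 (36.65) and fit1 (51.294). With only 10 observations such differences should be read cautiously, and residual plots are worth checking as well. The data below are simulated stand-ins (an assumption), since Y and X are not given in the post.

```r
# Simulated stand-in data with 10 observations (assumption, not the poster's data).
set.seed(4)
X <- seq(0.5, 4, length.out = 10)
Y <- rgamma(10, shape = 30, rate = 30 / exp(0.8 + 0.55 * X))

fit1 <- glm(Y ~ X, family = gaussian(link = "identity"))
fit2 <- glm(Y ~ X, family = gaussian(link = "log"))
fit3 <- glm(Y ~ X, family = Gamma(link = "log"))

# AIC for all three fits in one table, smallest (preferred) first.
aic_tab <- AIC(fit1, fit2, fit3)
print(aic_tab[order(aic_tab$AIC), ])

# Residual deviances, shown only to illustrate that the scales differ by family.
print(sapply(list(fit1 = fit1, fit2 = fit2, fit3 = fit3), deviance))
```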
[R] Generalized linear models (GLM)
Hi,

I have a dataset

   loss max.loss grp
1    10       50   2
2    15       33   1
3    18       49   2
4    33       38   1
5     8       50   3
6    19       29   1
7    22       51   4
8    50       50   2
9    16       38   1
10   24       30   3

where loss and max.loss are monetary values (in dollars) and grp is the group number. Using a GLM, I have to determine the effect of max.loss and grp (and the interactions between them) on loss. My question is how to do this. Is it something like

glm(max.loss~loss,family=gaussian(link=identity))

where of course I can replace gaussian with Gamma, ... and the link with log, ...? Am I on the right track, or what should I change?

Thanks
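A minimal sketch of the formula direction, using the ten rows posted above: the quantity being explained (loss) goes on the left of the ~, and the explanatory variables go on the right, with grp converted to a factor so that it is treated as a label rather than a number. The choice of gaussian with identity link here is only a starting point, not a recommendation.

```r
# The ten rows from the post.
dat <- data.frame(
  loss     = c(10, 15, 18, 33, 8, 19, 22, 50, 16, 24),
  max.loss = c(50, 33, 49, 38, 50, 29, 51, 50, 38, 30),
  grp      = factor(c(2, 1, 2, 1, 3, 1, 4, 2, 1, 3))
)

# loss is the response; max.loss * grp expands to both main effects
# plus the max.loss:grp interaction.
fit <- glm(loss ~ max.loss * grp, family = gaussian(link = "identity"), data = dat)
print(coef(fit))
```

With these ten rows, group 4 contains a single observation, so its separate slope cannot be estimated and the coefficient max.loss:grp4 is reported as NA. The full dataset would be needed for a meaningful interaction model.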
Re: [R] Generalized linear models (GLM)
Actually, both max.loss and loss are known values (in dollars). I am very much in doubt about what to choose:

glm(max.loss~loss,family=gaussian(link=identity))

or

glm(formula = sum ~ claims * as.factor(grp), family = gaussian(link = identity))

or

glm(loss~max.loss,family=gaussian(link=identity))

We have to look at gaussian and gamma, with identity and log links. But my problem is what should go on each side of the ~.

David Winsemius wrote:

> I think you are off-track because max.loss does not sound like a proper Y variable. Because max.loss is an amount that is known, in the insurance applications I have seen it would have been modeled within an offset term. Many of the examples have used number of ships or buildings or the person-years of exposure, but I do not see that the general strategy is limited to only such considerations.
>
> I would also suggest that you consider families other than Gaussian, perhaps negative binomial. The task for the analyst is then to translate output from the chosen model into interpretable meaning on the scale of interest, but I assume your course instructor will help with that.
>
> --
> David Winsemius
>
> On Apr 28, 2009, at 11:34 AM, mathallan wrote:
>
>> Hi,
>>
>> I have a dataset
>>
>>    loss max.loss grp
>> 1    10       50   2
>> 2    15       33   1
>> 3    18       49   2
>> 4    33       38   1
>> 5     8       50   3
>> 6    19       29   1
>> 7    22       51   4
>> 8    50       50   2
>> 9    16       38   1
>> 10   24       30   3
>>
>> where loss and max.loss are monetary values (in dollars) and grp is the group number. Using a GLM, I have to determine the effect of max.loss and grp (and the interactions between them) on loss. My question is how to do this. Is it something like
>>
>> glm(max.loss~loss,family=gaussian(link=identity))
>>
>> where of course I can replace gaussian with Gamma, ... and the link with log, ...? Am I on the right track, or what should I change?
>>
>> Thanks
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
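A rough sketch of what an offset term could look like for the posted ten rows. The reply above only says that a known amount such as max.loss is often handled through an offset; the specific choice of a Gamma family with log link and offset(log(max.loss)) is an assumption made here for illustration, not something stated in the thread. Under a log link, the offset fixes the coefficient of log(max.loss) at 1, so the model describes expected loss as a group-specific fraction of max.loss.

```r
# The ten rows from the original post.
dat <- data.frame(
  loss     = c(10, 15, 18, 33, 8, 19, 22, 50, 16, 24),
  max.loss = c(50, 33, 49, 38, 50, 29, 51, 50, 38, 30),
  grp      = factor(c(2, 1, 2, 1, 3, 1, 4, 2, 1, 3))
)

# Assumed model: E[loss] = max.loss * exp(group effect).
fit_off <- glm(loss ~ grp + offset(log(max.loss)),
               family = Gamma(link = "log"), data = dat)

# exp(intercept) is the estimated loss fraction in group 1;
# the other exponentiated coefficients are ratios relative to group 1.
print(exp(coef(fit_off)))

# The fitted loss as a fraction of max.loss is constant within each group.
ratio <- fitted(fit_off) / dat$max.loss
print(tapply(ratio, dat$grp, mean))
```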
Re: [R] Generalized linear models
Thanks for the answer, David.

Sum is the sum insured, i.e. the maximal loss of the company. Claims is the actual claim size. Group is the type of business that is insured. Can you help me solve the problem?

> It is very difficult to determine rightness since you have omitted essential background information. The most glaring omission is what sort of data is in sum. If this is either the number of policies or the dollar amount at risk, then a categorical NO is the answer to the question.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
[R] Generalized linear models
I have to fit a generalized linear model in R, and I have never done this before, so I am very much in doubt.

I have a dataset (of 4036 observations)

  claims       sum grp
1   3852  34570293   1
2   1194   7776468   1
3   3916  26343305   1
4   1258   5502915   1
5  11594 711453346   1
...

There are 4 groups (grp). The task is to determine the effect of sum and grp (and the interactions between them) on the claims. I have to test different link functions and distributions.

What I think I should do (in R) is

glm(claims~sum*grp, family=gaussian(link=log))

Call:  glm(formula = claims ~ sum * grp, family = gaussian(link = log))

Coefficients:
(Intercept)          sum          grp      sum:grp
  1.215e+01   -4.466e-09    6.814e-02    5.294e-09

Degrees of Freedom: 4035 Total (i.e. Null);  4032 Residual
Null Deviance:      3.371e+16
Residual Deviance:  3.355e+16    AIC: 131500

Is this right? And how can the output be interpreted? Did I even answer the question, and how can I plot a fitted curve through the observations?

Thank you so much for helping.
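Two hedged observations on the call above, followed by a sketch. First, the output shows a single grp coefficient, which means grp was treated as a number; wrapping it in factor() gives one set of terms per group. Second, a fitted curve can be drawn by predicting on a grid of sum values for each group and adding the predictions to a scatter plot. The data below are simulated stand-ins, and the Gamma family with log link and log(sum) as predictor are assumptions chosen only to make the example run; the helper `plot_fit()` is hypothetical.

```r
# Simulated stand-in data (assumption): the real 4036 observations are not in the post.
set.seed(5)
n   <- 300
dat <- data.frame(
  sum = exp(runif(n, min = 13, max = 20)),
  grp = factor(sample(1:4, n, replace = TRUE))   # grp as a factor, not a number
)
dat$claims <- rgamma(n, shape = 2, rate = 2 / exp(3.5 + 0.33 * log(dat$sum)))

fit <- glm(claims ~ log(sum) * grp, family = Gamma(link = "log"), data = dat)

# Predictions on the response scale over a grid of sum values, per group.
grid <- expand.grid(sum = exp(seq(13, 20, length.out = 50)), grp = factor(1:4))
grid$fit <- predict(fit, newdata = grid, type = "response")

# Hypothetical helper: scatter plot of the data with one fitted curve per group.
plot_fit <- function() {
  plot(claims ~ sum, data = dat, log = "xy",
       col = as.integer(dat$grp), pch = 16, cex = 0.5)
  for (g in levels(grid$grp)) {
    sub <- grid[grid$grp == g, ]
    lines(sub$sum, sub$fit, col = as.integer(g), lwd = 2)
  }
}
if (interactive()) plot_fit()
```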