[R] glm fit

2009-05-08 Thread mathallan

Hi, I am asking here in the hope that someone can help me understand this
problem.

I have fitted a glm in R, with these results:

> glm1 <- glm(log(claims)~log(sum)*as.factor(grp),family=gaussian(link=identity))
> summary(glm1)

Call:
glm(formula = log(claims) ~ log(sum) * as.factor(grp), family =
gaussian(link = identity))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-6.6836  -1.3626  -0.2576   1.2038   8.2480

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)               3.525657   0.436102   8.084 8.18e-16 ***
log(sum)                  0.334288   0.025668  13.024  < 2e-16 ***
as.factor(grp)2           0.434262   0.976240   0.445   0.6565
as.factor(grp)3           3.666490   1.436471   2.552   0.0107 *
as.factor(grp)4           0.040782   1.024730   0.040   0.9683
log(sum):as.factor(grp)2  0.007719   0.061914   0.125   0.9008
log(sum):as.factor(grp)3 -0.209986   0.091578  -2.293   0.0219 *
log(sum):as.factor(grp)4  0.059342   0.067320   0.881   0.3781
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for gaussian family taken to be 3.693731)

Null deviance: 15839  on 4035  degrees of freedom
Residual deviance: 14878  on 4028  degrees of freedom
AIC: 16737

Number of Fisher Scoring iterations: 2


But I'm not sure what I get out of the summary. What does it tell me?

What should the formula for Y look like, based on the summary? Is it
something like

Y=0.334288*X_sum+0.434262*X_2+  ??


And once I have an expression for Y, what does it tell me? Is it an
expression for the expected value?

Can anyone help me, please?
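
One way to read the coefficient table, offered as a sketch rather than a
definitive answer: with a gaussian family and identity link, the model
describes the expected value of log(claims). Group 1 is the reference level,
so its line is (Intercept) + log(sum) coefficient * log(sum). For group k, the
as.factor(grp)k term shifts the intercept and the log(sum):as.factor(grp)k
term shifts the slope. With the posted estimates, group 3 would be
(3.525657 + 3.666490) + (0.334288 - 0.209986) * log(sum). The data below are
simulated; `dat`, the simulation parameters and the example sum of 1e6 are
assumptions, not values from the post. The point is only that the formula
built by hand reproduces predict().

```r
# Simulated stand-in data: column names mirror the post, values are made up.
set.seed(1)
n   <- 400
dat <- data.frame(
  sum = exp(rnorm(n, mean = 15, sd = 1.5)),
  grp = sample(1:4, n, replace = TRUE)
)
dat$claims <- exp(3.5 + 0.33 * log(dat$sum) + rnorm(n, sd = 1.9))

fit <- glm(log(claims) ~ log(sum) * as.factor(grp),
           family = gaussian(link = "identity"), data = dat)
b <- coef(fit)

# Expected log(claims) for a group-3 policy with sum = 1e6 (hypothetical):
# (intercept + group-3 shift) + (base slope + group-3 slope shift) * log(sum)
x <- 1e6
by_hand <- (b["(Intercept)"] + b["as.factor(grp)3"]) +
  (b["log(sum)"] + b["log(sum):as.factor(grp)3"]) * log(x)

via_predict <- predict(fit, newdata = data.frame(sum = x, grp = 3))

# TRUE if the hand-built formula matches predict()
all.equal(unname(by_hand), unname(via_predict))
```

Note that the result is on the log scale: it estimates E[log(claims)], and
exponentiating it does not in general give E[claims].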

-- 
View this message in context: 
http://www.nabble.com/glm-fit-tp23443121p23443121.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Summary help

2009-05-06 Thread mathallan

Hi, I have fitted a gamma model and am wondering whether I can read the shape
and the scale directly from the summary:

             Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.612e+00  4.735e-02  34.052   <2e-16 ***
myvalue     3.564e-02  2.787e-03  12.788   <2e-16 ***
...

Is the shape = 1.612e+00
and the scale = 3.564e-02 ??

Is it the other way around, or can't it be done?
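
A hedged note on the question above: in a Gamma GLM the coefficient table
estimates the linear predictor for the mean, so neither row is the shape or
the scale. Under the usual parameterisation, shape = 1/dispersion and
scale = mean/shape, which means the scale differs from observation to
observation. The sketch below uses simulated data (all values, including
`true_shape`, are assumptions). The dispersion reported by summary() is a
moment-type estimate; MASS::gamma.shape() is the usual route to a
maximum-likelihood estimate of the shape.

```r
# Simulated gamma responses with a known shape, so the estimate can be checked.
set.seed(42)
n          <- 2000
x          <- runif(n, 0, 10)
true_shape <- 2
mu         <- exp(1.6 + 0.036 * x)          # mean on the response scale
y          <- rgamma(n, shape = true_shape, scale = mu / true_shape)

fit <- glm(y ~ x, family = Gamma(link = "log"))

# Shape from the dispersion line of the summary, not from the coefficients.
shape_hat <- 1 / summary(fit)$dispersion

# One scale value per observation: fitted mean divided by the shape.
scale_hat <- fitted(fit) / shape_hat

shape_hat
range(scale_hat)
```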
-- 
View this message in context: 
http://www.nabble.com/Summary-help-tp23406265p23406265.html


Re: [R] Summary help

2009-05-06 Thread mathallan

The glm is

glm(log(mydata)~log(max_data)*as.factor(grp),family=Gamma(link=log))

And I was wondering whether the scale and shape can be read from the summary.



There are quite a few gamma models around, so you should tell us more.
glmXXX? lmer?

Dieter




-- 
View this message in context: 
http://www.nabble.com/Summary-help-tp23406265p23410810.html


[R] Kolmogorov-Smirnov test

2009-04-29 Thread mathallan

I have a distribution function and an empirical distribution function. How do
I run a Kolmogorov-Smirnov test in R?

Let's call the empirical distribution function Fn on [0,1]
   and the distribution function F  on [0,1]

ks.test(  )

Thanks for the help
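
A minimal sketch of one possible call, assuming the raw sample behind Fn is
available: ks.test() takes the observations themselves, not the empirical
distribution function, and accepts the hypothesised distribution function
either by name or as an R function. The Beta(2, 2) distribution and the names
`x` and `cdf` are placeholders chosen for illustration.

```r
# Sample on [0, 1]; stands in for the data from which Fn was built.
set.seed(7)
x <- rbeta(200, shape1 = 2, shape2 = 2)

# Hypothesised distribution function F on [0, 1] (assumed for the example).
cdf <- function(q) pbeta(q, shape1 = 2, shape2 = 2)

# One-sample, two-sided Kolmogorov-Smirnov test of the sample against F.
res <- ks.test(x, cdf)
res

# The statistic is the largest gap between Fn and F, checked just before
# and at each jump of Fn.
Fn <- ecdf(x)
xs <- sort(x)
n  <- length(xs)
D_by_hand <- max(abs(cdf(xs) - Fn(xs)), abs(cdf(xs) - (Fn(xs) - 1 / n)))
all.equal(unname(res$statistic), D_by_hand)
```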
-- 
View this message in context: 
http://www.nabble.com/Kolmogorov-Smirnov-test-tp23296096p23296096.html


[R] How to read the summary

2009-04-28 Thread mathallan

How can I decide from the summary function which glm (fit1, fit2 or fit3)
fits the data best? I don't know what to look for, so could someone please
explain the important output?

> fit1 <- glm(Y~X, family=gaussian(link=identity))
> fit2 <- glm(Y~X, family=gaussian(link=log))
> fit3 <- glm(Y~X, family=Gamma(link=log))
> summary(fit1)

Call:
glm(formula = Y ~ X, family = gaussian(link = identity))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.6619  -1.9693  -0.4119   2.0787   3.9664

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  -0.4285     1.6213  -0.264 0.798258
X             4.3952     0.7089   6.200 0.000259 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for gaussian family taken to be 6.784605)

Null deviance: 315.081  on 9  degrees of freedom
Residual deviance:  54.277  on 8  degrees of freedom
AIC: 51.294

Number of Fisher Scoring iterations: 2

> summary(fit2)

Call:
glm(formula = Y ~ X, family = gaussian(link = log))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.5489  -0.2960   0.4776   0.6353   1.2773

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.50537    0.16562   3.051   0.0158 *
X            0.66352    0.05083  13.055 1.13e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for gaussian family taken to be 1.083989)

Null deviance: 315.0810  on 9  degrees of freedom
Residual deviance:   8.6718  on 8  degrees of freedom
AIC: 32.954

Number of Fisher Scoring iterations: 6

> summary(fit3)

Call:
glm(formula = Y ~ X, family = Gamma(link = log))

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-0.35269  -0.09272   0.02550   0.13625   0.18018

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.85959    0.11244   7.645 6.04e-05 ***
X            0.53134    0.04916  10.808 4.74e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for Gamma family taken to be 0.03262828)

Null deviance: 4.31315  on 9  degrees of freedom
Residual deviance: 0.28385  on 8  degrees of freedom
AIC: 36.65

Number of Fisher Scoring iterations: 5
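
A hedged pointer on comparing the three fits: the residual deviances are not
on a common scale across families (for gaussian it is a residual sum of
squares in the units of Y, for Gamma it is a relative measure), so 54.277 and
8.6718 cannot be set against 0.28385. AIC is comparable as long as every model
uses the same response Y on the same scale. In the posted output the AIC
values are 51.294, 32.954 and 36.65, so fit2 is lowest on that criterion,
though with 10 observations the gap to fit3 is small and residual plots
deserve a look as well. The sketch below uses simulated X and Y (assumed
values, not the poster's data) only to show the mechanics.

```r
# Simulated stand-in for the poster's 10 observations.
set.seed(3)
X <- seq(0.5, 5, length.out = 10)
Y <- rgamma(10, shape = 30, scale = exp(0.85 + 0.53 * X) / 30)

fit1 <- glm(Y ~ X, family = gaussian(link = "identity"))
fit2 <- glm(Y ~ X, family = gaussian(link = "log"))
fit3 <- glm(Y ~ X, family = Gamma(link = "log"))

# AIC for all three models in one table, smallest first.
cmp <- AIC(fit1, fit2, fit3)
cmp[order(cmp$AIC), ]
```

For the Gamma family R plugs an approximate dispersion estimate into the AIC,
so small differences should not be over-interpreted.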
-- 
View this message in context: 
http://www.nabble.com/How-to-read-the-summary-tp23276848p23276848.html


[R] Generalized linear models (GLM)

2009-04-28 Thread mathallan

Hi

I have a dataset

   loss max.loss grp
 1   10       50   2
 2   15       33   1
 3   18       49   2
 4   33       38   1
 5    8       50   3
 6   19       29   1
 7   22       51   4
 8   50       50   2
 9   16       38   1
10   24       30   3

where loss and max.loss are monetary values (in dollars). grp is the group
number.

Using a GLM, I have to determine the effect of max.loss and grp (and
interactions between them) on loss. My question is how to do this.

Is it something like

glm(max.loss~loss,family=gaussian(link=identity))

where of course I can replace gaussian with Gamma, ... and the link with log, ...

But am I on the right track, or what should I change?
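
A minimal sketch under stated assumptions, not a definitive answer: since the
task is the effect of max.loss and grp on loss, loss would go on the left of
the ~ and grp would be treated as a factor. The data frame below retypes the
ten posted rows, reading the last row as observation 10 with loss 24. With so
few rows only main effects are fitted here; on a full dataset,
max.loss * factor(grp) would add the interaction terms.

```r
# The ten posted observations (last row read as: 10, loss 24, max.loss 30, grp 3).
dat <- data.frame(
  loss     = c(10, 15, 18, 33, 8, 19, 22, 50, 16, 24),
  max.loss = c(50, 33, 49, 38, 50, 29, 51, 50, 38, 30),
  grp      = c(2, 1, 2, 1, 3, 1, 4, 2, 1, 3)
)

# Response on the left, predictors on the right; grp enters as a factor.
fit_gauss <- glm(loss ~ max.loss + factor(grp),
                 family = gaussian(link = "identity"), data = dat)
fit_gamma <- glm(loss ~ max.loss + factor(grp),
                 family = Gamma(link = "log"), data = dat)

coef(fit_gauss)
coef(fit_gamma)
```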


Thanks
-- 
View this message in context: 
http://www.nabble.com/Generalized-linear-models-%28GLM%29-tp23279588p23279588.html


Re: [R] Generalized linear models (GLM)

2009-04-28 Thread mathallan

Actually, both max.loss and loss are known values (in dollars). I am very much
in doubt about which to choose:


glm(max.loss~loss,family=gaussian(link=identity))

or

glm(formula = sum ~ claims * as.factor(grp), family = gaussian(link =
identity))

or
glm(loss~max.loss,family=gaussian(link=identity))

We have to look at gaussian and Gamma, with identity and log links.

But my problem is what should go on each side of the ~



David Winsemius wrote:
 
 I think you are off-track because max.loss does not sound like a  
 proper Y variable. Because max.loss is an amount that is known, in the  
 insurance applications I have seen it would have been modeled within  
 an offset term. Many of the examples have used number of ships or  
 buildings or the person years of exposure but I do not see that the  
 general strategy is limited to only such  considerations.
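
To illustrate the offset idea mentioned above, here is a sketch under
assumptions that are not in the thread: a Gamma family with log link, and
log(max.loss) as the offset, so that the model describes the ratio
loss / max.loss per group. The data frame retypes the ten posted rows,
reading the last row as observation 10 with loss 24.

```r
# The ten posted observations.
dat <- data.frame(
  loss     = c(10, 15, 18, 33, 8, 19, 22, 50, 16, 24),
  max.loss = c(50, 33, 49, 38, 50, 29, 51, 50, 38, 30),
  grp      = c(2, 1, 2, 1, 3, 1, 4, 2, 1, 3)
)

# log(E[loss]) = log(max.loss) + group effect, i.e. E[loss] = max.loss * ratio_grp
fit_off <- glm(loss ~ factor(grp) + offset(log(max.loss)),
               family = Gamma(link = "log"), data = dat)

# exp(intercept) is the fitted loss ratio for the reference group (grp 1);
# for this particular model it matches the group's mean observed ratio.
ratio_grp1 <- exp(coef(fit_off)[["(Intercept)"]])
mean_grp1  <- with(dat[dat$grp == 1, ], mean(loss / max.loss))
c(fitted = ratio_grp1, observed = mean_grp1)
```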
 
 I would also suggest that you consider links other than Gaussian,  
 perhaps negative binomial.
 
 The task for the analyst is then to translate output from the chosen  
 model into interpretable meaning on the scale of interest, but I  
 assume your course instructor will help with that.
 
 -- 
 David Winsemius
 On Apr 28, 2009, at 11:34 AM, mathallan wrote:
 

 
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Generalized-linear-models-%28GLM%29-tp23279588p23283633.html


Re: [R] Generalized linear models

2009-04-28 Thread mathallan

Thanks for the answer David

Sum is the sum insured, i.e. the maximal loss for the company. Claims is the
actual claim size. Group is the type of business insured.

Can you help me to solve the problem?



It is very difficult to determine rightness since you have omitted  
essential background information. The most glaring omission is what  
sort of data is in sum. If this is either the number of policies or  
the dollar amount at risk then a categorical NO is the answer to the  
question.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT




-- 
View this message in context: 
http://www.nabble.com/Generalized-linear-models-tp23265349p23271211.html


[R] Generalized linear models

2009-04-27 Thread mathallan

I have to fit a generalized linear model in R, and I have never done this
before, so I am very much in doubt.

I have a dataset (of 4036 observations)

  claims       sum grp
1   3852  34570293   1
2   1194   7776468   1
3   3916  26343305   1
4   1258   5502915   1
5  11594 711453346   1
...

there are 4 groups (grp).

The task is to determine the effect of sum and grp (and interactions between
them) on the claims.

I have to test using different link functions and distributions


What I think I should do is (in R)

 glm(claims~sum*grp, family=gaussian(link=log))

Call:  glm(formula = claims ~ sum * grp, family = gaussian(link = log)) 

Coefficients:
(Intercept)          sum          grp      sum:grp
  1.215e+01   -4.466e-09    6.814e-02    5.294e-09

Degrees of Freedom: 4035 Total (i.e. Null);  4032 Residual
Null Deviance:      3.371e+16
Residual Deviance:  3.355e+16    AIC: 131500


Is this right? And how can the output be interpreted?

Did I even answer the question, and how can I plot a fitted curve through the
observations?
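
Two hedged observations, with a sketch. First, the output above has a single
grp coefficient, which suggests grp entered as a number; wrapping it in
factor() gives each of the 4 groups its own intercept and slope. Second, a
fitted curve can be drawn by predicting on a grid with type = "response" and
adding lines to a scatter plot. The data below are simulated, and the Gamma
family with log link and the use of log(sum) are choices made for the example,
not something taken from the post.

```r
# Simulated stand-in data with the same column names as the post.
set.seed(11)
n   <- 300
dat <- data.frame(
  sum = exp(runif(n, 13, 20)),
  grp = factor(sample(1:4, n, replace = TRUE))
)
dat$claims <- rgamma(n, shape = 2, scale = exp(2 + 0.4 * log(dat$sum)) / 2)

# grp as a factor, interacting with log(sum).
fit <- glm(claims ~ log(sum) * grp, family = Gamma(link = "log"), data = dat)

# Grid of sums for every group; predictions on the scale of claims.
grid <- expand.grid(sum = exp(seq(13, 20, length.out = 50)),
                    grp = factor(1:4))
grid$fit <- predict(fit, newdata = grid, type = "response")

# Observations plus one fitted curve per group, on log-log axes.
plot(claims ~ sum, data = dat, log = "xy",
     col = as.integer(dat$grp), pch = 16, cex = 0.5)
for (g in levels(dat$grp)) {
  with(grid[grid$grp == g, ], lines(sum, fit, col = as.integer(g), lwd = 2))
}
```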


/Thank you so much for helping


-- 
View this message in context: 
http://www.nabble.com/Generalized-linear-models-tp23265349p23265349.html