Re: [R] Quality of fit statistics for NLS?

2012-01-27 Thread peter dalgaard

On Jan 26, 2012, at 22:51 , Bert Gunter wrote:

> Inline below.
>
> -- Bert
>
> On Thu, Jan 26, 2012 at 12:16 PM, Max Brondfield
> max.brondfi...@gmail.com wrote:
>> Dear all,
>> I am trying to analyze some non-linear data to which I have fit a curve of
>> the following form:
>>
>> dum <- nls(y~(A + (B*x)/(C+x)), start = list(A=370,B=100,C=23000))
>>
>> I am wondering if there is any way to determine meaningful quality of fit
>> statistics from the nls function?
>>
>> A summary yields highly significant p-values, but it is my impression that
>> these are questionable at best given the iterative nature of the fit:
> No. They are questionable primarily because there is no clear null
> model. They are based on profile likelihoods (as ?confint tells you),
> which may or may not be what you want for goodness of fit.
>
> One can always get goodness of fit statistics but the question in
> nonlinear models is: goodness of fit with respect to what? So the
> answer to your question is: if you know what you're doing, certainly.
> Otherwise, find someone who does.

...and if you are in the process of learning what you are doing: p-values are 
almost _never_ a good measure of goodness-of-fit, whereas the residual standard 
error might be, especially if you take a prediction approach to things. For 
one-dimensional curve fits, a graph of the data with the fitted curve is often 
what is really needed.
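
A minimal sketch of the two checks mentioned above (residual standard error,
and a plot of the data with the fitted curve). The poster's data were not
posted, so x and y here are simulated stand-ins with made-up parameter values;
only the model formula and the starting values come from the thread.

```r
# Simulated stand-in data (assumption: the real x and y are not available).
set.seed(1)
x <- seq(500, 200000, length.out = 248)
y <- 390 + 115 * x / (21000 + x) + rnorm(length(x), sd = 18)

# Same model form and starting values as in the original post.
fit <- nls(y ~ A + (B * x) / (C + x),
           start = list(A = 370, B = 100, C = 23000))

# Residual standard error: typical size of a residual, in the units of y.
rse <- sigma(fit)
cat("Residual standard error:", signif(rse, 4), "\n")

# Data with the fitted curve overlaid.
plot(x, y, pch = 16, col = "grey50", main = "Data and fitted curve")
xs <- seq(min(x), max(x), length.out = 400)
lines(xs, predict(fit, newdata = data.frame(x = xs)), lwd = 2)
```

Whether a residual standard error of this size is acceptable depends on how
precisely y needs to be predicted, which is a subject-matter question.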

Also notice that summaries of fitted models are not useful for detecting
systematic deviations from the model (like systematic over/under-estimation in
some regions); for that you need diagnostic plots and/or comparisons with
extended models.
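
One way the diagnostic plot and the extended-model comparison might look, as a
sketch only. The data are simulated stand-ins, and the extra drift term
D * (x / 1e5) is a hypothetical extension chosen for illustration; a real
analysis would pick the extension on subject-matter grounds.

```r
# Simulated stand-in data (assumption: the real x and y are not available).
set.seed(2)
x <- seq(500, 200000, length.out = 248)
y <- 390 + 115 * x / (21000 + x) + rnorm(length(x), sd = 18)

fit <- nls(y ~ A + (B * x) / (C + x),
           start = list(A = 370, B = 100, C = 23000))

# Residuals against x: a trend or bend in the smooth hints at systematic misfit.
res <- residuals(fit)
plot(x, res, pch = 16, col = "grey50", ylab = "Residual")
abline(h = 0, lty = 2)
lines(lowess(x, res), lwd = 2)

# Hypothetical extended model with an added linear drift term,
# started from the simpler fit with D = 0.
fit_ext <- nls(y ~ A + (B * x) / (C + x) + D * (x / 1e5),
               start = c(as.list(coef(fit)), D = 0))

# F test for the nested pair: does the extra term reduce the residual
# sum of squares by more than chance would suggest?
cmp <- anova(fit, fit_ext)
print(cmp)
```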

 
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 

-- 
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R] Quality of fit statistics for NLS?

2012-01-27 Thread John C Nash
Peter and Bert have already made some pertinent remarks. This comment is a bit
tangential, but in the same flavour. As they note, it is "goodness of fit
relative to what?" that is important.

As a matter of course when doing nonlinear least squares, I generally compute
the quantity
   [1 - residual_sumsquares/(total sum of squares)].

In linear modelling this is usually called R-squared, but I don't want to
create a firestorm of complaints by suggesting it be called that here. I'm not
doing anything here other than a check for silly results. All I'm suggesting
is that a comparison to the model that is the mean of the variable being
fitted is a minimal sanity check. Surely we should be able to do better than
the mean? It's saved me from wasting time on several occasions, sometimes
because the model proposed was really wrong, sometimes because there was a
nuisance local minimum well away from a solution, and most often due to a
silly typo in setting things up. And it can usually be computed within a cat()
statement.
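
A short sketch of that sanity check, assuming the total sum of squares is
taken about the mean of the response (as the comparison with the mean model
suggests). The data are simulated stand-ins, since the thread's data were not
posted; the names rss, tss and fit_ratio are illustrative.

```r
# Simulated stand-in data (assumption: the real x and y are not available).
set.seed(3)
x <- seq(500, 200000, length.out = 248)
y <- 390 + 115 * x / (21000 + x) + rnorm(length(x), sd = 18)

fit <- nls(y ~ A + (B * x) / (C + x),
           start = list(A = 370, B = 100, C = 23000))

rss <- sum(residuals(fit)^2)   # residual sum of squares of the fitted model
tss <- sum((y - mean(y))^2)    # sum of squares about the mean of y
fit_ratio <- 1 - rss / tss     # near 0 or negative: no better than the mean

cat("1 - RSS/TSS =", fit_ratio, "\n")
```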

Best, John Nash




[R] Quality of fit statistics for NLS?

2012-01-26 Thread Max Brondfield
Dear all,
I am trying to analyze some non-linear data to which I have fit a curve of
the following form:

dum <- nls(y~(A + (B*x)/(C+x)), start = list(A=370,B=100,C=23000))

I am wondering if there is any way to determine meaningful quality of fit
statistics from the nls function?

A summary yields highly significant p-values, but it is my impression that
these are questionable at best given the iterative nature of the fit:

> summary(dum)

Formula: y ~ (A + (B * x)/(C + x))

Parameters:
   Estimate Std. Error t value Pr(>|t|)
A   388.753      4.794  81.090  < 2e-16 ***
B   115.215      5.006  23.015  < 2e-16 ***
C 20843.832   4646.937   4.485 1.12e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 18.25 on 245 degrees of freedom

Number of iterations to convergence: 4
Achieved convergence tolerance: 2.244e-06


Is there any other means of determining the quality of the curve fit? I
have tried applying confidence intervals using confint(dum), but these
curves seem unrealistically narrow. Thanks so much for your help!
-Max
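
For reference, a sketch of what confint() returns for a fit of this form,
using simulated stand-in data (the poster's data were not posted). It gives
intervals for the parameters A, B and C, not bands around the fitted curve.
For nls fits the intervals are profile-based; in older R versions the nls
method is supplied by the MASS package, which this sketch assumes is
installed. confint.default() is shown alongside for the simpler Wald
intervals.

```r
# Simulated stand-in data (assumption: the real x and y are not available).
set.seed(4)
x <- seq(500, 200000, length.out = 248)
y <- 390 + 115 * x / (21000 + x) + rnorm(length(x), sd = 18)

fit <- nls(y ~ A + (B * x) / (C + x),
           start = list(A = 370, B = 100, C = 23000))

# Profile-based intervals for the parameters (default level 95%).
ci_profile <- confint(fit)

# Wald intervals: estimate +/- normal quantile * standard error.
ci_wald <- confint.default(fit)

print(ci_profile)
print(ci_wald)
```

The two sets of intervals can differ noticeably for a parameter such as C
whose profile is asymmetric.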



Re: [R] Quality of fit statistics for NLS?

2012-01-26 Thread Bert Gunter
Inline below.

-- Bert

On Thu, Jan 26, 2012 at 12:16 PM, Max Brondfield
max.brondfi...@gmail.com wrote:
> Dear all,
> I am trying to analyze some non-linear data to which I have fit a curve of
> the following form:
>
> dum <- nls(y~(A + (B*x)/(C+x)), start = list(A=370,B=100,C=23000))
>
> I am wondering if there is any way to determine meaningful quality of fit
> statistics from the nls function?
>
> A summary yields highly significant p-values, but it is my impression that
> these are questionable at best given the iterative nature of the fit:
No. They are questionable primarily because there is no clear null
model. They are based on profile likelihoods (as ?confint tells you),
which may or may not be what you want for goodness of fit.

One can always get goodness of fit statistics but the question in
nonlinear models is: goodness of fit with respect to what? So the
answer to your question is: if you know what you're doing, certainly.
Otherwise, find someone who does.









-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
