Re: [Rd] subset argument in nls() and possibly other functions

Marc Schwartz via R-devel Tue, 13 Jul 2021 18:14:51 -0700

Hi John,

In scanning some of the more popular model functions (e.g. lm(), glm(), lme(), coxph(), etc.), none seem to provide examples of the use of the 'subset' argument, even though it is documented for them.


That being said, there is some old (2003) documentation by Prof Ripley here:

https://developer.r-project.org/model-fitting-functions.html

that may be helpful, and where the link to the lm() function source code on the above page should be:


  https://svn.r-project.org/R/trunk/src/library/stats/R/lm.R

Within that source file, you might want to focus on the model.frame.lm() function, the basic form of which is used internally in many (most? all?) of the typical model-related functions in R to create the internal data frame from the specified formula, which is then used to create the model.
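
A minimal sketch of how the 'subset' argument behaves when a model frame is built, calling the generic model.frame() directly. The data are taken from the example in the quoted message below; the linear formula ydata ~ xdata is only a stand-in chosen for illustration, not the nls() model itself.

```r
# Data from the example in the quoted message below
xdata <- c(-2, -1.64, -1.33, -0.7, 0, 0.45, 1.2, 1.64, 2.32, 2.9)
ydata <- c(0.699369, 0.700462, 0.695354, 1.03905, 1.97389, 2.41143,
           1.91091, 0.919576, -0.730975, -1.42001)
Cdata <- data.frame(xdata, ydata)

# 'subset' given as an index vector: keep the first 8 rows
mf_idx <- model.frame(ydata ~ xdata, data = Cdata, subset = 1:8)

# 'subset' given as a logical expression; 'xdata' is looked up in 'data'
mf_lgl <- model.frame(ydata ~ xdata, data = Cdata, subset = xdata < 2)

nrow(mf_idx)  # 8
nrow(mf_lgl)  # 8 (only 2.32 and 2.9 fail the condition)
```

Both calls should return the same eight rows, which suggests that the row selection happens while the model frame is constructed, before any fitting code sees the data.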


There is a parallel model.frame.glm() function for glm() here:

https://svn.r-project.org/R/trunk/src/library/stats/R/glm.R

There is also a 2003 paper by Thomas Lumley on non-standard evaluation that may be helpful:


https://developer.r-project.org/nonstandard-eval.pdf
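
A rough sketch of the non-standard evaluation idiom that modelling functions such as lm() use to hand 'subset' on to model.frame(). The function name frame_for() is hypothetical and written only for illustration; it is not part of R, and the data are again those from the quoted message below.

```r
# Hypothetical helper (not part of R): rebuilds its own call as a call to
# stats::model.frame() and evaluates it in the caller's frame.
frame_for <- function(formula, data, subset, ...) {
  mf <- match.call(expand.dots = FALSE)
  # Keep only the arguments model.frame() needs; absent ones match to 0
  keep <- match(c("formula", "data", "subset"), names(mf), 0L)
  mf <- mf[c(1L, keep)]
  mf[[1L]] <- quote(stats::model.frame)
  # 'subset' is still an unevaluated expression at this point;
  # model.frame() evaluates it using the columns of 'data'
  eval(mf, parent.frame())
}

xdata <- c(-2, -1.64, -1.33, -0.7, 0, 0.45, 1.2, 1.64, 2.32, 2.9)
ydata <- c(0.699369, 0.700462, 0.695354, 1.03905, 1.97389, 2.41143,
           1.91091, 0.919576, -0.730975, -1.42001)
Cdata <- data.frame(xdata, ydata)

mf_all <- frame_for(ydata ~ xdata, data = Cdata)
mf_sub <- frame_for(ydata ~ xdata, data = Cdata, subset = xdata < 2)
nrow(mf_all)  # 10
nrow(mf_sub)  # 8
```

Because the function never touches 'subset' by name, a search of the source for that word can come up empty even though the argument is honoured.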

The help for the generic ?model.frame has the following text for the 'subset' argument:

I cannot recall offhand using the 'subset' argument myself in ~20 years of using R, but I do seem to recall some old discussions on the e-mail lists, which I cannot seem to locate at present. A search via rseek.org may yield some benefit.


Regards,

Marc Schwartz


J C Nash wrote on 7/13/21 7:21 PM:

In mentoring and participating in a Google Summer of Code project "Improvements to nls()", I've not found examples of use of the "subset" argument in the call to nls(). Moreover, in searching through the source code for the various functions related to nls(), I can't seem to find where subset is used, but a simple example, included below, indicates it works. Three approaches all seem to give the same results.

Can someone point to documentation or code so we can make sure we get our revised programs to work properly? The aim is to make them more maintainable and provide maintainer documentation, along with some improved functionality. We seem, for example, to already be able to offer analytic derivatives where they are feasible, and should be able to add Marquardt-Levenberg stabilization as an option.

Note that this "subset" does not seem to be the "subset()" function of R.

John Nash

# CroucherSubset.R -- https://walkingrandomly.com/?p=5254

xdata = c(-2,-1.64,-1.33,-0.7,0,0.45,1.2,1.64,2.32,2.9)
ydata = c(0.699369,0.700462,0.695354,1.03905,1.97389,2.41143,1.91091,0.919576,-0.730975,-1.42001)
Cform <- ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata)
Cstart<-list(p1=1,p2=0.2)
Cdata<-data.frame(xdata, ydata)
Csubset<-1:8 # just first 8 points

# Original problem - no subset
fit0 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata,
           start=list(p1=1,p2=.2))
summary(fit0)

# via subset argument
fit1 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata,
           start=list(p1=1,p2=.2), subset=Csubset)
summary(fit1)

# via explicit subsetting
Csdata <- Cdata[Csubset, ]
Csdata
fit2 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Csdata,
           start=list(p1=1,p2=.2))
summary(fit2)

# via weights -- seems to give correct observation count if zeros not recognized
wts <- c(rep(1,8), rep(0,2))
fit3 = nls(ydata ~ p1*cos(p2*xdata) + p2*sin(p1*xdata), data=Cdata,
           weights=wts, start=list(p1=1,p2=.2))
summary(fit3)

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


