On Fri, 7 Dec 2007, Gabor Grothendieck wrote:
On Dec 7, 2007 8:10 AM, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
Ben Bolker wrote:
At this point I'd just like to advertise the "bbmle" package
(on CRAN) for those who respectfully disagree, as I do, with Peter over
this issue. I have added a data= argument to my version
of the function that allows other variables to be passed
to the objective function. It seems to me that this is perfectly
in line with the way that other modeling functions in R
behave.
This is at least cleaner than abusing the "fixed" argument. As you know,
I have reservations, one of which is that it is not a given that I want
it to behave just like other modeling functions, e.g. a likelihood
function might refer to more than one data set, and/or data that are not
structured in the traditional data frame format. The design needs more
thought than just adding arguments.
I still prefer a design based a plain likelihood function. Then we can
discuss how to construct such a function so that the data are
incorporated in a flexible way. There are many ways to do this, I've
shown one, here's another:
f <- function(lambda) -sum(dpois(x, lambda, log=T))
d <- data.frame(x=rpois(10000, 12.34))
environment(f)<-evalq(environment(),d)
mle(f, start=list(lambda=10))
Call:
mle(minuslogl = f, start = list(lambda = 10))
Coefficients:
lambda
12.3402
The explicit environment manipulation is what I was referring to but
I make extensive use of lexical scoping in my programming and I NEVER
use explicit environment manipulaiton--for me that is unreadable, and it
is not amenable to checking using things like codetools. In the
example above all that is needed is to define x directly, e.g.
> f <- function(lambda) -sum(dpois(x, lambda, log=T))
> x <- rpois(10000, 12.34)
> mle(f, start=list(lambda=10))
Call:
mle(minuslogl = f, start = list(lambda = 10))
Coefficients:
lambda
12.337
It isn't necessary to go through the data frame or environment
munging. If you want to be able to work with likelihoods for several
data sets at once then you can either use diferent names for the
variables, like
x1 <- rpois(10000, 12.34)
f1 <- function(lambda) -sum(dpois(x1, lambda, log=T))
x2 <- rpois(10000, 12.34)
f2 <- function(lambda) -sum(dpois(x2, lambda, log=T))
If you are concerned that x, x1, x2 might have been redefined if you
come back to f1, f2 later (not an issue with typical useage inside a
function but can be an issue at top level) then you can create a
closure that cpatures the particular data set you are using. The
clean way to do this is with a function that creates the negative log
likelihood, e.g.
makePoisonNegLogLikelihood <- function(x)
function(lambda) -sum(dpois(x, lambda, log=T))
Then you can do
f <- makePoisonNegLogLikelihood(rpois(10000, 12.34))
mle(f, start=list(lambda=10))
which I find much cleaner and easier to understand than environment
munging. Once you are defining a likelihood constructor you can think
about things like making it a bit more efficient by calculating
sufficient statistics once, for example
makePoisonNegLogLikelihood <- function(x) {
sumX <- sum(x)
n <- length(x)
function(lambda) -dpois(sumX, n * lambda, log=T)
}
Best,
luke
we can simplify it using proto. Create a proto object to hold
f and x then pass the f in the proto object (rather than the
original f) to mle. That works because proto automatically resets
the environment of f when its added to avoiding the evalq.
set.seed(1)
library(proto)
f <- function(lambda) -sum(dpois(x, lambda, log=TRUE))
p <- proto(f = f, x = rpois(100, 12.34))
mle(p[["f"]], start = list(lambda = 10))
Call:
mle(minuslogl = p[["f"]], start = list(lambda = 10))
Coefficients:
lambda
12.46000
It is not at all an unlikely design to have mle() as a generic function
which works on many kinds of objects, the default method being
function(object,...) mle(minuslogl(obj)) and minuslogl is an extractor
function returning (tada!) the negative log likelihood function.
(My version also has a cool formula interface and other
bells and whistles, and I would love to get feedback from other
useRs about it.)
cheers
Ben Bolker
--
O__ ---- Peter Dalgaard ?ster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa Phone: 319-335-3386
Department of Statistics and Fax: 319-335-3017
Actuarial Science
241 Schaeffer Hall email: [EMAIL PROTECTED]
Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel