Re: [Rd] [R] unvectorized option for outer()

Thomas Lumley Sun, 30 Oct 2005 13:03:05 -0800

On Sun, 30 Oct 2005, Jonathan Rougier wrote:

> I'm not sure about this.  Perhaps I am a dinosaur, but my feeling is
> that if people are writing functions in R that might be subject to
> simple operations like outer products, then they ought to be writing
> vectorised functions!


I would agree.  How about an oapply() function that does multiway (rather 
than just two-way) outer products.  Basing the name on "apply" would 
emphasize the similarity to other flexible, not particularly optimized 
second-order functions.

        -thomas



>               Maybe it's not possible to hold this line, and
> maybe "outer" is not the right place to draw it, but I think we ought to
> respect the "x is a vector" mindset as much as possible in the base
> package.  As Tony notes, the documentation does try to be clear about
> what outer actually does, and how it can be used.
>
> So I am not a fan of the VECTORIZED argument, and definitely not a fan
> of the VECTORIZED=FALSE default.
>
> Jonathan.
>
> Gabor Grothendieck wrote:
>> If the default were changed to VECTORIZED=FALSE then it would
>> still be functionally compatible with what we have now so all existing
>> software would continue to run correctly yet would not cause
>> problems for the unwary.  Existing software would not have to be changed
>> to add VECTORIZED=TRUE except for those, presumbly few, cases
>> where outer performance is critical.   One optimization might be to
>> have the default be TRUE if the function is * or perhaps if its specified
>> as a single character and FALSE otherwise.
>>
>> Having used APL I always regarded the original design of outer in R as
>> permature performance optimization and this would be a chance to get
>> it right.
>>
>> On 10/28/05, Tony Plate <[EMAIL PROTECTED]> wrote:
>>
>>> [following on from a thread on R-help, but my post here seems more
>>> appropriate to R-devel]
>>>
>>> Would a patch to make outer() work with non-vectorized functions be
>>> considered?  It seems to come up moderately often on the list, which
>>> probably indicates that many many people get bitten by the same
>>> incorrect expectation, despite the documentation and the FAQ entry.  It
>>> looks pretty simple to modify outer() appropriately: one extra function
>>> argument and an if-then-else clause to call mapply(FUN, ...) instead of
>>> calling FUN directly.
>>>
>>> Here's a function demonstrating this:
>>>
>>> outer2 <- function (X, Y, FUN = "*", ..., VECTORIZED=TRUE)
>>> {
>>>    no.nx <- is.null(nx <- dimnames(X <- as.array(X)))
>>>    dX <- dim(X)
>>>    no.ny <- is.null(ny <- dimnames(Y <- as.array(Y)))
>>>    dY <- dim(Y)
>>>    if (is.character(FUN) && FUN == "*") {
>>>        robj <- as.vector(X) %*% t(as.vector(Y))
>>>        dim(robj) <- c(dX, dY)
>>>    }
>>>    else {
>>>        FUN <- match.fun(FUN)
>>>        Y <- rep(Y, rep.int(length(X), length(Y)))
>>>        if (length(X) > 0)
>>>            X <- rep(X, times = ceiling(length(Y)/length(X)))
>>>        if (VECTORIZED)
>>>            robj <- FUN(X, Y, ...)
>>>        else
>>>            robj <- mapply(FUN, X, Y, MoreArgs=list(...))
>>>        dim(robj) <- c(dX, dY)
>>>    }
>>>    if (no.nx)
>>>        nx <- vector("list", length(dX))
>>>    else if (no.ny)
>>>        ny <- vector("list", length(dY))
>>>    if (!(no.nx && no.ny))
>>>        dimnames(robj) <- c(nx, ny)
>>>    robj
>>> }
>>> # Some examples
>>> f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p}
>>> outer2(1:2, 3:5, f, 2)
>>> outer2(numeric(0), 3:5, f, 2)
>>> outer2(1:2, numeric(0), f, 2)
>>> outer2(1:2, 3:5, f, 2, VECTORIZED=F)
>>> outer2(numeric(0), 3:5, f, 2, VECTORIZED=F)
>>> outer2(1:2, numeric(0), f, 2, VECTORIZED=F)
>>>
>>> # Output on examples
>>>
>>>> f <- function(x, y, p=1) {cat("in f\n"); (x*y)^p}
>>>> outer2(1:2, 3:5, f, 2)
>>>
>>> in f
>>>     [,1] [,2] [,3]
>>> [1,]    9   16   25
>>> [2,]   36   64  100
>>>
>>>> outer2(numeric(0), 3:5, f, 2)
>>>
>>> in f
>>>     [,1] [,2] [,3]
>>>
>>>> outer2(1:2, numeric(0), f, 2)
>>>
>>> in f
>>>
>>> [1,]
>>> [2,]
>>>
>>>> outer2(1:2, 3:5, f, 2, VECTORIZED=F)
>>>
>>> in f
>>> in f
>>> in f
>>> in f
>>> in f
>>> in f
>>>     [,1] [,2] [,3]
>>> [1,]    9   16   25
>>> [2,]   36   64  100
>>>
>>>> outer2(numeric(0), 3:5, f, 2, VECTORIZED=F)
>>>
>>>     [,1] [,2] [,3]
>>>
>>>> outer2(1:2, numeric(0), f, 2, VECTORIZED=F)
>>>
>>> [1,]
>>> [2,]
>>>
>>> If a patch to add this feature would be considered, I'd be happy to
>>> submit one (including documentation).  If so, and if there are any
>>> potential traps I should bear in mind, please let me know!
>>>
>>> -- Tony Plate
>>>
>>> Rau, Roland wrote:
>>>
>>>> Dear all,
>>>>
>>>> a big thanks to Thomas Lumley, James Holtman and Tony Plate for their
>>>> answers. They all pointed in the same direction => I need a vectorized
>>>> function to be applied. Hence, I will try to work with a 'wrapper'
>>>> function as described in the FAQ.
>>>>
>>>> Thanks again,
>>>> Roland
>>>>
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Thomas Lumley [mailto:[EMAIL PROTECTED]
>>>>> Sent: Thursday, October 27, 2005 11:39 PM
>>>>> To: Rau, Roland
>>>>> Cc: r-help@stat.math.ethz.ch
>>>>> Subject: Re: [R] outer-question
>>>>>
>>>>>
>>>>> You want FAQ 7.17 Why does outer() behave strangely with my function?
>>>>>
>>>>>     -thomas
>>>>>
>>>>> On Thu, 27 Oct 2005, Rau, Roland wrote:
>>>>>
>>>>>
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> This is a rather lengthy message, but I don't know what I
>>>>>
>>>>> made wrong in
>>>>>
>>>>>
>>>>>> my real example since the simple code works.
>>>>>> I have two variables a, b and a function f for which I would like to
>>>>>> calculate all possible combinations of the values of a and b.
>>>>>> If f is multiplication, I would simply do:
>>>>>>
>>>>>> a <- 1:5
>>>>>> b <- 1:5
>>>>>> outer(a,b)
>>>>>>
>>>>>> ## A bit more complicated is this:
>>>>>> f <- function(a,b,d) {
>>>>>>    return(a*b+(sum(d)))
>>>>>> }
>>>>>> additional <- runif(100)
>>>>>> outer(X=a, Y=b, FUN=f, d=additional)
>>>>>>
>>>>>> ## So far so good. But now my real example. I would like to plot the
>>>>>> ## log-likelihood surface for two parameters alpha and beta of
>>>>>> ## a Gompertz distribution with given data
>>>>>>
>>>>>> ### I have a function to generate random-numbers from a
>>>>>> Gompertz-Distribution
>>>>>> ### (using the 'inversion method')
>>>>>>
>>>>>> random.gomp <- function(n, alpha, beta) {
>>>>>>          return( (log(1-(beta/alpha*log(1-runif(n)))))/beta)
>>>>>> }
>>>>>>
>>>>>> ## Now I generate some 'lifetimes'
>>>>>> no.people <- 1000
>>>>>> al <- 0.1
>>>>>> bet <- 0.1
>>>>>> lifetimes <- random.gomp(n=no.people, alpha=al, beta=bet)
>>>>>>
>>>>>> ### Since I neither have censoring nor truncation in this
>>>>>
>>>>> simple case,
>>>>>
>>>>>
>>>>>> ### the log-likelihood should be simply the sum of the log of the
>>>>>> ### the densities (following the parametrization of
>>>>>
>>>>> Klein/Moeschberger
>>>>>
>>>>>
>>>>>> ### Survival Analysis, p. 38)
>>>>>>
>>>>>> loggomp <- function(alphas, betas, timep) {
>>>>>> return(sum(log(alphas) + betas*timep + (alphas/betas *
>>>>>> (1-exp(betas*timep)))))
>>>>>> }
>>>>>>
>>>>>> ### Now I thought I could obtain a matrix of the
>>>>>
>>>>> log-likelihood surface
>>>>>
>>>>>
>>>>>> ### by specifying possible values for alpha and beta with the given
>>>>>> data.
>>>>>> ### I was able to produce this matrix with two for-loops.
>>>>>
>>>>> But I thought
>>>>>
>>>>>
>>>>>> ### I could use also 'outer' in this case.
>>>>>> ### This is what I tried:
>>>>>>
>>>>>> possible.alphas <- seq(from=0.05, to=0.15, length=30)
>>>>>> possible.betas <- seq(from=0.05, to=0.15, length=30)
>>>>>>
>>>>>> outer(X=possible.alphas, Y=possible.betas, FUN=loggomp,
>>>>>
>>>>> timep=lifetimes)
>>>>>
>>>>>
>>>>>> ### But the result is:
>>>>>>
>>>>>>
>>>>>>> outer(X=possible.alphas, Y=possible.betas, FUN=loggomp,
>>>>>>
>>>>>> timep=lifetimes)
>>>>>> Error in outer(X = possible.alphas, Y = possible.betas, FUN
>>>>>
>>>>> = loggomp,
>>>>>
>>>>>
>>>>>> :
>>>>>>      dim<- : dims [product 900] do not match the length
>>>>>
>>>>> of object [1]
>>>>>
>>>>>
>>>>>> In addition: Warning messages:
>>>>>> ...
>>>>>>
>>>>>> ### Can somebody give me some hint where the problem is?
>>>>>> ### I checked my definition of 'loggomp' but I thought this
>>>>>
>>>>> looks fine:
>>>>>
>>>>>
>>>>>> loggomp(alphas=possible.alphas[1], betas=possible.betas[1],
>>>>>> timep=lifetimes)
>>>>>> loggomp(alphas=possible.alphas[4], betas=possible.betas[10],
>>>>>> timep=lifetimes)
>>>>>> loggomp(alphas=possible.alphas[3], betas=possible.betas[11],
>>>>>> timep=lifetimes)
>>>>>>
>>>>>>
>>>>>> ### I'd appreciate any kind of advice.
>>>>>> ### Thanks a lot in advance.
>>>>>> ### Roland
>>>>>>
>>>>>>
>>>>>> +++++
>>>>>> This mail has been sent through the MPI for Demographic
>>>>>
>>>>> Rese...{{dropped}}
>>>>>
>>>>>
>>>>>> ______________________________________________
>>>>>> R-help@stat.math.ethz.ch mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide!
>>>>>
>>>>> http://www.R-project.org/posting-guide.html
>>>>>
>>>>> Thomas Lumley                 Assoc. Professor, Biostatistics
>>>>> [EMAIL PROTECTED]      University of Washington, Seattle
>>>>>
>>>>
>>>>
>>>> +++++
>>>> This mail has been sent through the MPI for Demographic Research.  Should 
>>>> you receive a mail that is apparently from a MPI user without this text 
>>>> displayed, then the address has most likely been faked. If you are 
>>>> uncertain about the validity of this message, please check the mail header 
>>>> or ask your system administrator for assistance.
>>>>
>>>>
>>>
>>> ______________________________________________
>>> R-help@stat.math.ethz.ch mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide! 
>>> http://www.R-project.org/posting-guide.html
>>>
>>
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> -- 
> Jonathan Rougier                       Science Laboratories
> Department of Mathematical Sciences    South Road
> University of Durham                   Durham DH1 3LE
> tel: +44 (0)191 334 3111, fax: +44 (0)191 334 3051
> http://www.maths.dur.ac.uk/stats/people/jcr/jcr.html
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]       University of Washington, Seattle

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [R] unvectorized option for outer()

Reply via email to