We encountered a performance problem when a large number of R scripts are run 
simultaneously.  A large number of stat() system calls on /etc/timezone was 
limiting how many scripts could be run effectively.  I traced the problem to 
as.Date.character, where strptime() is called without a tz argument when no 
format argument is given.

as.Date.character <- function(x, format="", ...)
{
    charToDate <- function(x) {
        xx <- x[1L]
        if(is.na(xx)) {
            j <- 1L
            while(is.na(xx) && (j <- j+1L) <= length(x)) xx <- x[j]
            if(is.na(xx)) f <- "%Y-%m-%d" # all NAs
        }
        if(is.na(xx) ||
           !is.na(strptime(xx, f <- "%Y-%m-%d", tz="GMT")) ||
           !is.na(strptime(xx, f <- "%Y/%m/%d", tz="GMT"))
           ) return(strptime(x, f))
        stop("character string is not in a standard unambiguous format")
    }
    res <- if(missing(format)) charToDate(x) else strptime(x, format, tz="GMT")
    as.Date(res)
}

We can easily work around this by specifying a format.  My question is: should 
strptime(x, f) also be given a tz argument, as in the case where a format is 
specified?
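
For illustration, here is a minimal sketch of both options. The first part is 
the workaround (an explicit format, which takes the branch that already passes 
tz="GMT").  The second part is a hypothetical helper, char_to_date_gmt, which 
is not part of R; it copies the logic of charToDate above and only adds 
tz="GMT" to the final strptime() call.  I am assuming that the local timezone 
lookup in that call is what triggers the stat() calls.

```r
# Workaround: an explicit format avoids charToDate() entirely.
x <- c("2011-03-01", NA, "2011-12-31")
d1 <- as.Date(x, format = "%Y-%m-%d")

# Hypothetical patched helper (NOT the R source): same logic as charToDate,
# except that the final strptime() call also passes tz = "GMT".
char_to_date_gmt <- function(x) {
    xx <- x[1L]
    if (is.na(xx)) {
        j <- 1L
        # find the first non-NA element to test the formats against
        while (is.na(xx) && (j <- j + 1L) <= length(x)) xx <- x[j]
        if (is.na(xx)) f <- "%Y-%m-%d"  # all NAs
    }
    if (is.na(xx) ||
        !is.na(strptime(xx, f <- "%Y-%m-%d", tz = "GMT")) ||
        !is.na(strptime(xx, f <- "%Y/%m/%d", tz = "GMT")))
        return(as.Date(strptime(x, f, tz = "GMT")))
    stop("character string is not in a standard unambiguous format")
}

d2 <- char_to_date_gmt(x)
```

Both routes should give the same dates, since as.Date() on the strptime() 
result uses only the year, month and day fields.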

Thanks,

Paul

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
