> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Thomas Mang
> Sent: Monday, November 24, 2008 1:02 AM
> To: [EMAIL PROTECTED]
> Subject: [Rd] timezone attribute lost
>
> Hi,
>
> As I didn't get any response on the general help list and I don't know
> if there is a bug in action I am trying my luck here.
>
> I was highly surprised to find out that during simple operations (see
> code below) the timezone attribute for POSIXct data is lost and then,
> upon the next interpretation, the system settings are used (which are
> plain wrong in my case).
>
> I have used R 2.8.0 under Windows XP with the system timezone (managed
> by Windows) set to CET - I suppose however that all other timezones,
> with the exception of GMT, will show similar surprising behavior (and
> those who live in GMT-zone: If you change your timezone setting please
> restart R, otherwise the effect won't take place).
>
> # input data
> # note that the timezone is deliberately set to GMT, and of course I
> # want the operations below to take place in GMT-time
> Time = as.POSIXct(strptime(c("2007-12-12 14:30:15", "2008-04-14 15:31:34",
>     "2008-04-14 15:31:34"), format = "%Y-%m-%d %H:%M:%S", tz = "GMT"))
> Time # OK, time zone is GMT
> attr(Time, "tzone") # OK, time zone is GMT
>
> # Surprise 1:
> TApply = tapply(1:3, Time, max)
> names(TApply) # wrong, names are converted to time zone of system
>
> # Surprise 2:
> UTime = unique(Time)
> UTime # wrong, again time zone of system is used
> attr(UTime, "tzone") # == NULL
>
> I know how to "solve" the problem (for example by setting an R system
> variable TZ to GMT), but I wonder why is this mess happening at all?
> Moreover, is this behavior considered to be a feature, or a plain bug?

All of those problems are due to a problem in unique.default(), which
sends the underlying numeric codes of POSIXct, Date, and factor objects
through a .Internal() and then tries to reconstruct the original sort of
object from the bare output of that .Internal():

    z <- .Internal(unique(x, incomparables, fromLast))
    if (is.factor(x))
        factor(z, levels = seq_len(nlevels(x)), labels = levels(x),
               ordered = is.ordered(x))
    else if (inherits(x, "POSIXct") || inherits(x, "Date"))
        structure(z, class = class(x))

Your immediate problem could be solved by adding tzone = attr(x, "tzone")
to the structure() call, but I'm not familiar enough with classes
inheriting from POSIXct and Date to know if that is sufficient. Nothing
stops someone from making a new subclass in which some other attribute
is essential.

Since the .Internal() used the equivalent of as.numeric(x) to extract the
numeric codes, it might be nice to have an as.numeric<-(x, value)
function that could insert numeric codes back into a dataset. Then you
could write as.numeric(x) <- z and avoid reconstructing an object of
unknown structure (or perhaps as.vector<- should be used, so you don't
have to know what the integer type is). In S and S+ one can use
[EMAIL PROTECTED]<-newNumericCodes for this sort of thing, but that can
be dangerous because it lets you stick in inappropriate types.

One might think that adding a new unique method for POSIXct or Date, or
for things subclassed from them, would be the right way to structure
things, but factor() explicitly calls unique.default(), so such a method
would not be used there.

> Thanks,
> Thomas
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
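
For illustration only, here is a rough user-level sketch of the two ideas in the reply above: carrying tzone across the de-duplication, and a replacement function that puts numeric codes back into an existing object. Both unique_keep_tz() and `as.numeric<-` are hypothetical names that do not exist in base R, and unique(as.numeric(x)) merely stands in for the .Internal() call. It is not a patch to unique.default(), and it has the same limitation noted above: a subclass with other essential attributes may need more care.

```r
# Hypothetical helper (not in base R): de-duplicate a POSIXct vector and
# carry the "tzone" attribute across, as suggested for the structure() call.
unique_keep_tz <- function(x) {
    stopifnot(inherits(x, "POSIXct"))
    z <- unique(as.numeric(x))   # stand-in for the .Internal(unique(...)) step
    structure(z, class = class(x), tzone = attr(x, "tzone"))
}

# Hypothetical replacement function (not in base R): put new numeric codes
# into x while keeping x's other attributes. Length-dependent attributes
# are dropped because 'value' may be shorter than x.
`as.numeric<-` <- function(x, value) {
    a <- attributes(x)
    a <- a[setdiff(names(a), c("names", "dim", "dimnames"))]
    attributes(value) <- a
    value
}

# The data from the original report.
Time <- as.POSIXct(strptime(c("2007-12-12 14:30:15", "2008-04-14 15:31:34",
                              "2008-04-14 15:31:34"),
                            format = "%Y-%m-%d %H:%M:%S", tz = "GMT"))

UTime <- unique_keep_tz(Time)
attr(UTime, "tzone")             # the tzone of Time is carried over

# Same result via the replacement form, without naming any attribute.
UTime2 <- Time
as.numeric(UTime2) <- unique(as.numeric(Time))
attr(UTime2, "tzone")
```

The replacement form never has to know which attributes the class relies on, which is the point of the as.numeric<- suggestion. It does not cover factors, whose codes must be of integer type.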