Re: [Rd] POSIXlt matching bug
POSIXlt is a list, and it is not a list of dates or times; it is a list of the following components:

    > x <- as.POSIXlt(Sys.Date())
    > names(x)
    [1] "sec"   "min"   "hour"  "mday"  "mon"   "year"  "wday"  "yday"  "isdst"

So if you want to match these things, you should use POSIXct or any other numeric-based format (POSIXct is just a double value for the number of seconds since 1970-01-01), e.g.

    > z <- as.POSIXct(Sys.Date())
    > x <- as.POSIXct(Sys.Date())
    > z == x
    [1] TRUE
    > match(z, x)
    [1] 1
    > z %in% x
    [1] TRUE

Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of McGehee, Robert
Sent: 29 June 2010 15:46
To: r-b...@r-project.org; r-devel@r-project.org
Subject: [Rd] POSIXlt matching bug

I came across the below mis-feature/bug using match with POSIXlt objects (from strptime) in R 2.11.1 (though this appears to be an old issue).

    > x <- as.POSIXlt(Sys.Date())
    > table <- as.POSIXlt(Sys.Date()+0:5)
    > length(x)
    [1] 1
    > x %in% table   # I expect TRUE
    [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    > match(x, table)   # I expect 1
    [1] NA NA NA NA NA NA NA NA NA

This behavior seemed more plausible when the length of a POSIXlt object was 9 (back in the day). However, since the length was redefined, the length of x no longer matches the length of the match function output, as specified by the ?match documentation: "A vector of the same length as 'x'".

I would normally suggest that we add a POSIXlt method for match that converts x into POSIXct or character first. However, match does not appear to be generic. Below is a possible rewrite of match that appears to work as desired.

    match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
        .Internal(match(if(is.factor(x)||inherits(x, "POSIXlt"))
                            as.character(x) else x,
                        if(is.factor(table)||inherits(table, "POSIXlt"))
                            as.character(table) else table,
                        nomatch, incomparables))

That said, I understand some people may be very sensitive to the speed of the match function, and may prefer a simple change to the ?match documentation noting this (odd) behavior for POSIXlt.

Thanks, Robert

    > R.version
                   _
    platform       x86_64-unknown-linux-gnu
    arch           x86_64
    os             linux-gnu
    system         x86_64, linux-gnu
    status
    major          2
    minor          11.1
    year           2010
    month          05
    day            31
    svn rev        52157
    language       R
    version.string R version 2.11.1 (2010-05-31)

Robert McGehee, CFA
Geode Capital Management, LLC

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
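The workaround suggested above, converting to POSIXct before matching, can be sketched as a small wrapper. The name match_datetime is hypothetical and not part of R, and fixed dates replace Sys.Date() so the result is reproducible. It compares the numeric seconds-since-epoch values, so it is only a sketch for whole-second timestamps:

```r
# Hypothetical helper (not part of base R): match date-times by their
# numeric POSIXct value, i.e. seconds since 1970-01-01.
match_datetime <- function(x, table, nomatch = NA_integer_) {
  match(as.numeric(as.POSIXct(x)), as.numeric(as.POSIXct(table)),
        nomatch = nomatch)
}

# Fixed dates instead of Sys.Date(), so the example is reproducible.
x     <- as.POSIXlt(as.Date("2010-06-29"))
table <- as.POSIXlt(as.Date("2010-06-29") + 0:5)

pos   <- match_datetime(x, table)   # position of x in table
found <- !is.na(pos)                # analogue of x %in% table
print(pos)
print(found)
```

Because both arguments pass through the same conversion, the result has one element per date-time in x, which is the length that ?match promises.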
Re: [Rd] POSIXlt matching bug
>>>>> "RobMcG" == McGehee, Robert <robert.mcge...@geodecapital.com>
>>>>>     on Tue, 29 Jun 2010 10:46:06 -0400 writes:

    RobMcG> I would normally suggest that we add a POSIXlt method for match that
    RobMcG> converts x into POSIXct or character first. However, match does not
    RobMcG> appear to be generic. Below is a possible rewrite of match that appears
    RobMcG> to work as desired.

    RobMcG> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
    RobMcG>     .Internal(match(if(is.factor(x)||inherits(x, "POSIXlt"))
    RobMcG>                         as.character(x) else x,
    RobMcG>                     if(is.factor(table)||inherits(table, "POSIXlt"))
    RobMcG>                         as.character(table) else table,
    RobMcG>                     nomatch, incomparables))

    RobMcG> That said, I understand some people may be very sensitive to the speed
    RobMcG> of the match function,

yes, indeed.

I'm currently investigating an alternative. It takes considerably more programming time, but in the end should lose much less speed: .Internal()ize the tests in C code, so that the resulting R code would simply be

    match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
        .Internal(match(x, table, nomatch, incomparables))

Martin Maechler, ETH Zurich

    RobMcG> and may prefer a simple change to the ?match
    RobMcG> documentation noting this (odd) behavior for POSIXlt.
Re: [Rd] POSIXlt matching bug
>>>>> "MM" == Martin Maechler <maech...@stat.math.ethz.ch>
>>>>>     on Fri, 2 Jul 2010 12:22:07 +0200 writes:

>>>>> "RobMcG" == McGehee, Robert <robert.mcge...@geodecapital.com>
>>>>>     on Tue, 29 Jun 2010 10:46:06 -0400 writes:

    RobMcG> That said, I understand some people may be very sensitive to the speed
    RobMcG> of the match function,

    MM> yes, indeed.

    MM> I'm currently investigating an alternative, considerably more
    MM> programming time, but in the end should lose much less speed,
    MM> is to .Internal()ize the tests in C code,
    MM> so that the resulting R code would simply be

    MM> match <- function(x, table, nomatch = NA_integer_, incomparables = NULL)
    MM>     .Internal(match(x, table, nomatch, incomparables))

I have committed such a change to R-devel, to become R 2.12.x. This should mean that match() is now actually very slightly faster than it used to be, although the speed gain may not be measurable.

Martin Maechler, ETH Zurich
[Rd] Best way to determine if you're running 32 or 64 bit R on windows
Hi,

Is this sufficient?

    if (.Machine$sizeof.pointer == 4) {
        cat('32\n')
    } else {
        cat('64\n')
    }

Or is it better to test something in R.version, say os? I'd like to use this to specify appropriate linker arguments when building the RMySQL Windows package.

Jeff
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner
Re: [Rd] Best way to determine if you're running 32 or 64 bit R on windows
Jeffrey Horner <jeffrey.horner at gmail.com> writes:

> Is this sufficient?
>
>     if (.Machine$sizeof.pointer == 4) {
>         cat('32\n')
>     } else {
>         cat('64\n')
>     }
>
> Or is it better to test something in R.version, say os?

No, the above is perfect, as it also works on other platforms to distinguish 32-bit and 64-bit.

Regards,
Martin
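The pointer-size test can be wrapped so that build scripts get a value to branch on rather than printed text. This is only a sketch: r_bits and lib_subdir are hypothetical names, and mapping the word size onto the "i386" and "x64" sub-architecture directory names used by R's dual-architecture Windows builds is an assumption about how the package's libraries might be laid out.

```r
# Pointer size in bytes is 4 on 32-bit builds and 8 on 64-bit builds,
# so multiplying by 8 gives the word size in bits.
r_bits <- function() 8L * .Machine$sizeof.pointer

# Illustrative only: choose a library subdirectory from the word size.
# The directory names are an assumption, not something RMySQL mandates.
lib_subdir <- if (r_bits() == 64) "x64" else "i386"

cat(r_bits(), "-bit R; linking against libraries in ", lib_subdir, "\n", sep = "")
```

Returning the number of bits keeps the check platform-independent, as noted in the reply, and leaves the choice of linker arguments to the caller.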
[Rd] Attributes of 1st argument in ...
R-Devel:

I am trying to get an attribute of the first argument in a call to a function whose formal arguments consist of dots only, and then do something, e.g., call 'cbind', based on the attribute:

    f <- function(...) {get first attribute; maybe or maybe not call 'cbind'}

I thought of (ignoring deparse.level for the moment)

    f <- function(...) {x <- attr(list(...)[[1L]], "foo"); if (x == "bar") cbind(...) else x}

but I feared my solution might do some extra copying, with a performance penalty if the dotted objects in the actual call to 'f' are very large. I thought the following alternative might avoid a potential performance hit by evaluating the attribute in the parent.frame (and therefore avoid extra copying?):

    f <- function(...) {
        L <- match.call(expand.dots=FALSE)[[2L]]
        x <- eval(substitute(attr(x, "foo"), list(x=L[[1L]])))
        if (x == "bar") cbind(...) else x
    }

system.time tests showed this second form to be only marginally faster.

Is my fear about extra copying unwarranted? If not, is there a better way to get the "foo" attribute of the first argument other than my two alternatives?

Thanks,
Dan Murphy
Re: [Rd] Attributes of 1st argument in ...
Hi Daniel,

On 02.07.2010, at 23:26, Daniel Murphy wrote:

> I am trying to get an attribute of the first argument in a call to a
> function whose formal arguments consist of dots only and do something,
> e.g., call 'cbind', based on the attribute
>
>     f <- function(...) {get first attribute; maybe or maybe not call 'cbind'}
>
> I thought of (ignoring deparse.level for the moment)
>
>     f <- function(...) {x <- attr(list(...)[[1L]], "foo"); if (x == "bar") cbind(...) else x}

what about using the somewhat obscure ..1 syntax? This version runs quite a bit faster for me:

    g <- function(...) {
        x <- attr(..1, "foo")
        if (x == "bar") cbind(...) else x
    }

but it will be hard to quantify how this pans out for you unless we know how many arguments there are and what size and type they are.

Cheers,
Olaf
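A small self-contained run of the ..1 approach might look as follows. The objects a, a2 and b are made up for illustration, and identical() is swapped in for == (a departure from the code above) so that a first argument without a "foo" attribute does not make if() fail on a zero-length condition.

```r
g <- function(...) {
  # ..1 refers to the first element of ... without building list(...)
  x <- attr(..1, "foo")
  # identical() instead of ==, so a missing attribute (NULL) simply
  # takes the else branch rather than raising an error in if()
  if (identical(x, "bar")) cbind(...) else x
}

a  <- structure(1:3, foo = "bar")   # first argument carries foo = "bar"
a2 <- structure(1:3, foo = "baz")   # first argument carries another value
b  <- 4:6

res_bound <- g(a, b)    # attribute is "bar": the arguments are cbind()-ed
res_attr  <- g(a2, b)   # attribute is not "bar": the attribute is returned

print(dim(res_bound))
print(res_attr)
```

Only the first argument is inspected before the decision, and the remaining arguments are evaluated only if cbind(...) is actually called.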
[Rd] kmeans
In kmeans() in stats one gets an error message with the default clustering algorithm if centers = 1. It's often useful to calculate the sum of squares for 1 cluster, 2 clusters, etc., and this error complicates things since one has to treat 1 cluster as a special case. A second reason is that easily getting the 1-cluster sum of squares makes it easy to calculate the between-cluster sum of squares when there is more than 1 cluster.

I suggest adding the line marked ### to the source code of kmeans (the other lines shown are just there to illustrate context). Adding this line forces kmeans to use the code for algorithm 3 if centers is 1. This is useful since, unlike the code for the default algorithm, the code for algorithm 3 succeeds for centers = 1.

    if(length(centers) == 1) {
        if (centers == 1) nmeth <- 3 ###
        k <- centers

Also note that KMeans in Rcmdr produces a betweenss and a tot.withinss, and it would be nice if kmeans in stats did that too:

    > library(Rcmdr)
    > str(KMeans(USArrests, 3))
    List of 6
     $ cluster     : Named int [1:50] 1 1 1 2 1 2 3 1 1 2 ...
      ..- attr(*, "names")= chr [1:50] "Alabama" "Alaska" "Arizona" "Arkansas" ...
     $ centers     : num [1:3, 1:4] 11.81 8.21 4.27 272.56 173.29 ...
      ..- attr(*, "dimnames")=List of 2
      .. ..$ : chr [1:3] "1" "2" "3"
      .. ..$ : chr [1:4] "Murder" "Assault" "UrbanPop" "Rape"
     $ withinss    : num [1:3] 19564 9137 19264
     $ size        : int [1:3] 16 14 20
     $ tot.withinss: num 47964   <==
     $ betweenss   : num 307844  <==
     - attr(*, "class")= chr "kmeans"
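Until kmeans() handles this itself, the special-casing described above can be sketched as follows. The helper names total_ss and ss_for_k are hypothetical. The 1-cluster sum of squares is computed directly as the sum of squared deviations from the column means, and the between-cluster sum of squares is taken as that total minus the total within-cluster sum of squares.

```r
# Total sum of squares about the column means. This is the 1-cluster
# case, computed directly so that kmeans() never sees centers = 1.
total_ss <- function(x) sum(scale(as.matrix(x), scale = FALSE)^2)

# Hypothetical helper: within- and between-cluster sums of squares
# for k clusters; extra arguments are passed on to kmeans().
ss_for_k <- function(x, k, ...) {
  tot    <- total_ss(x)
  within <- if (k == 1) tot else sum(kmeans(x, centers = k, ...)$withinss)
  c(k = k, tot.withinss = within, betweenss = tot - within)
}

set.seed(1)  # kmeans() chooses random starting centers
ss <- t(sapply(1:4, function(k) ss_for_k(USArrests, k, nstart = 5)))
print(ss)
```

Each row of ss holds one value of k, so the 1-cluster row can be compared with the others without any special handling by the caller.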