The test.data.table() routine returns 714, not 717. I'm running data.table 1.8.2.
The only thing not bleeding edge (I think) is R itself, which is at 2.15.0. A search for "merge" on r-forge gets two hits, neither of which is related; a search for setcolorder gets no hits. Should I file a bug report (or two)?

Here's my output from test.data.table() and sessionInfo():

> test.data.table()
Running .../tests.Rraw
Loading required package: hexbin
Loading required package: grid
Loading required package: lattice
x = 10,000 sample from 100 strings (quick test to save load on CRAN servers where tests run every day. In dev we increase n and m a lot for meaningful times.
 0.002 : f=factor(x) [high up front cost, plus storage and maintenance of levels]
 0.000 : sort.list(,'radix') on f
 0.000 : u=unique(x)
 0.000 : .Internal(order(u))
 0.000 : sort.list(,'radix') on fsorted
-vs-
 0.000 : char group on x (ad hoc by) [slower than radix on f but without up front cost]
 0.000 : char sort on x (setkey) [lower up front cost than factor(x)]
 0.000 : char group on xsorted (keyed by) [faster than sort.list(,'radix') on fsorted, same result]
All 714 tests in test.data.table() completed ok in 15.272sec

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] hexbin_1.26.0    lattice_0.20-6   nlme_3.1-103     ggplot2_0.9.1
[5] reshape_0.8.4    plyr_1.7.1       data.table_1.8.2

loaded via a namespace (and not attached):
 [1] colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2       labeling_0.1
 [5] MASS_7.3-17        memoise_0.1        munsell_0.3        proto_0.3-9.2
 [9] RColorBrewer_1.0-5 reshape2_1.2.1     scales_0.2.1       stringr_0.6.1

-----Original Message-----
From: Matthew Dowle [mailto:[email protected]]
Sent: Wednesday, August 08, 2012 4:49 AM
To: Kaupas, George
Cc: [email protected]
Subject: Re: [datatable-help] can I count on data.table supporting syntactically invalid column names?

Meant to write 2nd paragraph as follows :

>
> Hi. Yes you should be able to rely on that. It's useful to have
> special characters in column names for latex formatting, and spaces
> are allowed too. There are tests for these things. If you need to
> refer to such column names as variables, then it's up to you to wrap
> with ``; e.g., by=`Illegal(name%)`+1.
>
> So yes, if you find problems with special characters, please report them
> as bugs, and suggestions on where the documentation needs improving
> would be great.
>
> I seem to remember a bug fix in this regard, and in particular in
> merge (so my first thought is to ask you if you've recently upgraded
> to 1.8.2 and if test.data.table returns 717), but as you say R-Forge
> is currently down for maintenance...
>
> That neworder error looks familiar too. Are you sure you have 1.8.2
> running in memory? (Run test.data.table() to see if it returns 717).
>
> Matthew
>
>> I'm taking advantage of a feature in data.table which lets me get
>> away with naming columns with characters that would not survive a
>> call to make.names(), e.g.:
>>
>>> DT1 = data.table(a=letters[1:5], "Illegal(name%)"=1:5, key="a")
>>> DT1
>>    a Illegal(name%)
>> 1: a              1
>> 2: b              2
>> 3: c              3
>> 4: d              4
>> 5: e              5
>>
>> (The dcast function from the reshape2 package will also create
>> columns named "illegally".)
>>
>> But when using merge.data.table, I get two side-effects; either the
>> merge works, but the column names appear to be run through
>> make.names(), or the merge fails in setcolorder():
>>
>>> DT1 = data.table(a=letters[1:5], "Illegal(name%)"=1:5, key="a")
>>> DT2 = data.table(a=letters[1:5], b=6L, key="a")
>>
>>> merge(DT1,DT2)
>>    a Illegal.name.. b
>> 1: a              1 6
>> 2: b              2 6
>> 3: c              3 6
>> 4: d              4 6
>> 5: e              5 6
>>
>>> merge(DT2,DT1)
>> Error in setcolorder(dt, c(setdiff(names(dt), end), end)) :
>>   neworder is length 4 but x has 3 columns.
>>
>> I can't get to datatable.r-forge.r-project.org - getting a 504.
>>
>> So... should I NOT rely on being able to use special characters in
>> column names?
>>
>> Thanks
>> George
>>
>>> sessionInfo()
>> R version 2.15.0 (2012-03-30)
>> Platform: x86_64-unknown-linux-gnu (64-bit)
>> [1] data.table_1.8.2

_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
