Thanks. Have added that (1970 potential issue) to statquant's FR to follow up...
https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2582&group_id=240&atid=978

On 26.02.2013 00:46, Alexander Chernyakov wrote:

> Regarding fasttime: my understanding is that it only works after 1970.
>
> On Mon, Feb 25, 2013 at 7:41 PM, <[email protected]> wrote:
>
>> Today's Topics:
>>
>> 1. About adding fastmatch and fasttime to data.table (stat quant)
>> 2. Potential bug with sorting/summarizing by POSIXct and logical
>>    column (Victor Kryukov)
>> 3. Re: About adding fastmatch and fasttime to data.table
>>    (Matthew Dowle)
>> 4. Re: Potential bug with sorting/summarizing by POSIXct and
>>    logical column (Michael Nelson)
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Mon, 25 Feb 2013 19:40:35 +0100
>> From: stat quant <[email protected]>
>> To: [email protected]
>> Subject: [datatable-help] About adding fastmatch and fasttime to
>>    data.table
>> Message-ID:
>>    <cajjhha9ql8hurxf0+8onpad1t7y5csoolx7qdknuqxc1xpm...@mail.gmail.com>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Hello list,
>>
>> Looking at fastmatch and fasttime, I realized that each of those packages
>> consists of a single C file.
>> We spoke about the possibility of adding those to data.table. I tried to
>> contact S. Urbanek without any success, so I do not have feedback from his
>> side.
>> Using fastPOSIXct provides a huge gain when one has to load files with
>> datetimes. On my laptop, using data.table:::fread, I realized that most of
>> the time is spent casting datetimes to POSIXct (I have several columns).
>>
>> Looking at fasttime, you can see a pretty good improvement (factor 15):
>>
>> R) ts R) system.time(a utilisateur système écoulé
>> 6.49 0.04 6.57
>> R) system.time(b utilisateur système écoulé
>> 0.40 0.00 0.41
>>
>> Once colClasses is implemented in fread, may I suggest allowing
>> fasttime as an option?
>> Concerning fastmatch, the vignette already shows some nice benchmarks. I
>> tend to do a lot of selects based on string columns; not sure if this is
>> the case for most of us.
>>
>> My 0.002 cent
>> Cheers
>>
>> ------------------------------
>>
>> Message: 2
>> Date: Mon, 25 Feb 2013 14:26:28 -0800
>> From: Victor Kryukov <[email protected]>
>> To: [email protected]
>> Subject: [datatable-help] Potential bug with sorting/summarizing by
>>    POSIXct and logical column
>> Message-ID:
>>    <[email protected]>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> Hello,
>>
>> I've encountered what looks like a bug while sorting by a POSIXct and a
>> logical column, which may or may not be related to the following bug:
>>
>> https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2552&group_id=240&atid=975
>>
>> Here are all the details:
>> http://stackoverflow.com/questions/15077232/data-table-not-summarizing-properly-by-two-columns
>>
>> Here is the test case:
>>
>> # First some data
>> data month = structure(c(1356998400, 1356998400, 1356998400,
>>     1359676800, 1354320000, 1359676800, 1359676800, 1356998400, 1356998400,
>>     1354320000, 1354320000, 1354320000, 1359676800, 1359676800, 1359676800,
>>     1356998400, 1359676800, 1359676800, 1356998400, 1359676800, 1359676800,
>>     1359676800, 1359676800, 1354320000, 1354320000), class = c("POSIXct",
>>     "POSIXt"), tzone = "UTC"),
>>     portal = c(TRUE, TRUE, FALSE, TRUE,
>>     TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE,
>>     TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE
>>     ),
>>     satisfaction = c(10L, 10L, 10L, 9L, 10L, 10L, 9L, 10L, 10L,
>>     9L, 2L, 8L, 10L, 9L, 10L, 10L, 9L, 10L, 10L, 10L, 9L, 10L, 9L,
>>     10L, 10L)),
>>     .Names = c("month", "portal", "satisfaction"),
>>     row.names = c(NA, -25L), class = "data.frame"))
>>
>> # Summarizing by month, portal with tapply works:
>>
>> > tapply(data$satisfaction, list(data$month, data$portal), mean)
>>            FALSE      TRUE
>> 2012-12-01   8.5  8.000000
>> 2013-01-01  10.0 10.000000
>> 2013-02-01   9.0  9.545455
>>
>> # Summarizing with 'by' argument of data.table does not:
>>
>> > data[, mean(satisfaction), by = 'month,portal']
>> > data[, mean(satisfaction), by = list(month, portal)]
>>         month portal        V1
>> 1: 2013-01-01  FALSE 10.000000
>> 2: 2013-02-01   TRUE  9.000000
>> 3: 2013-01-01   TRUE 10.000000
>> 4: 2012-12-01  FALSE  8.500000
>> 5: 2012-12-01   TRUE  7.333333
>> 6: 2013-02-01   TRUE  9.666667
>> 7: 2013-02-01  FALSE  9.000000
>> 8: 2012-12-01   TRUE 10.000000
>>
>> # Summarizing only this year's data works:
>> data[month >= ymd(20130101), mean(satisfaction), by = 'month,portal']
>>         month portal        V1
>> 1: 2013-01-01   TRUE 10.000000
>> 2: 2013-01-01  FALSE 10.000000
>> 3: 2013-02-01   TRUE  9.545455
>> 4: 2013-02-01  FALSE  9.000000
>>
>> Yours Sincerely,
>> Victor Kryukov
>>
>> ------------------------------
>>
>> Message: 3
>> Date: Tue, 26 Feb 2013 00:39:09 +0000
>> From: Matthew Dowle <[email protected]>
>> To: <[email protected]>
>> Cc: [email protected]
>> Subject: Re: [datatable-help] About adding fastmatch and fasttime to
>>    data.table
>> Message-ID: <[email protected]>
>> Content-Type: text/plain; charset="utf-8"
>>
>> Hi,
>>
>> This sounds like a great idea. I don't know why Simon U didn't reply;
>> that may depend on the way you asked, whether he is on holiday at the
>> moment, his reaction to the precise wording of the email you wrote, or
>> some other factor. It is difficult to tell! But we don't need to wait
>> for him or for you: this is open source. You have got much further than
>> I have, so if you'd like to add this please go ahead and make progress.
>> You're very welcome to join the project and commit directly. Or if you
>> can't for some reason, please file it as a feature request so it doesn't
>> get forgotten.
>>
>> Matthew
>>
>> ------------------------------
>>
>> Message: 4
>> Date: Tue, 26 Feb 2013 00:40:02 +0000
>> From: Michael Nelson <[email protected]>
>> To: "[email protected]"
>>    <[email protected]>
>> Subject: Re: [datatable-help] Potential bug with sorting/summarizing
>>    by POSIXct and logical column
>> Message-ID:
>>    <6fb5193a6cdcdf499486a833b7afbdcd5827d...@ex-mbx-pro-04.mcs.usyd.edu.au>
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> I can't replicate this problem using data.table 1.8.7 (installed about
>> 3 weeks ago) on
>> R version 2.15.2 (2012-10-26)
>> Platform: i386-w64-mingw32/i386 (32-bit)
>>
>> Michael
>>
>> ------------------------------
>>
>> _______________________________________________
>> datatable-help mailing list
>> [email protected]
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>>
>> End of datatable-help Digest, Vol 36, Issue 8
>> *********************************************
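
On the 1970 point, here is a minimal sketch of how a caller could guard against it, assuming the timestamps are UTC strings of the form "YYYY-MM-DD HH:MM:SS". parse_utc is a hypothetical helper, not part of fasttime or data.table. It calls fasttime::fastPOSIXct only when fasttime is installed and the year is 1970 or later, and falls back to base as.POSIXct otherwise. The 1970 cutoff simply reflects the understanding stated in this thread; it has not been verified here.

```r
# Hypothetical helper (not part of fasttime or data.table): parse UTC
# timestamps written as "YYYY-MM-DD HH:MM:SS".
parse_utc <- function(x) {
  year <- suppressWarnings(as.integer(substr(x, 1L, 4L)))
  secs <- rep(NA_real_, length(x))

  # Take the fast path only when fasttime is installed and the year is
  # 1970 or later (assumed limitation, as discussed in this thread).
  have_fasttime <- requireNamespace("fasttime", quietly = TRUE)
  fast <- have_fasttime & !is.na(year) & year >= 1970L
  slow <- !fast & !is.na(x)

  if (any(fast)) {
    secs[fast] <- as.numeric(fasttime::fastPOSIXct(x[fast], tz = "UTC"))
  }
  if (any(slow)) {
    # Base R represents pre-1970 times as negative seconds since the epoch;
    # strings that do not match the format come back as NA.
    secs[slow] <- as.numeric(
      as.POSIXct(x[slow], tz = "UTC", format = "%Y-%m-%d %H:%M:%S")
    )
  }
  .POSIXct(secs, tz = "UTC")
}

x <- c("1969-12-31 23:59:59", "2013-02-25 19:40:35")
parse_utc(x)
```

Splitting on the year keeps the slow base parser confined to the rows that need it, so files with only recent dates would still get the full speed-up.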
