I can't replicate this problem using data.table 1.8.7 (installed about 3 weeks
ago) on:
R version 2.15.2 (2012-10-26)
Platform: i386-w64-mingw32/i386 (32-bit)
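
For anyone else trying to replicate this, the same three details can be printed from the session that ran the test. This is only a minimal sketch, assuming data.table is installed; `dt_version` is an illustrative name.

```r
# Print the version details that matter when comparing results across machines.
dt_version <- as.character(packageVersion("data.table"))  # installed data.table version
cat("data.table", dt_version, "\n")
cat(R.version.string, "\n")                  # e.g. "R version x.y.z (date)"
cat("Platform:", R.version$platform, "\n")   # build platform of this R
```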

Michael
________________________________
From: [email protected] [[email protected]] on behalf of Victor Kryukov [[email protected]]
Sent: Tuesday, 26 February 2013 9:26 AM
To: [email protected]
Subject: [datatable-help] Potential bug with sorting/summarizing by POSIXct and logical column

Hello,

I've encountered what looks like a bug while sorting by a POSIXct column and a
logical column, which may or may not be related to the following bug:

https://r-forge.r-project.org/tracker/index.php?func=detail&aid=2552&group_id=240&atid=975

Here are all the details: 
http://stackoverflow.com/questions/15077232/data-table-not-summarizing-properly-by-two-columns

Here is the test case:

    # First some data
    library(data.table)
    data <- data.table(structure(list(
      month = structure(c(1356998400, 1356998400, 1356998400, 1359676800,
                          1354320000, 1359676800, 1359676800, 1356998400,
                          1356998400, 1354320000, 1354320000, 1354320000,
                          1359676800, 1359676800, 1359676800, 1356998400,
                          1359676800, 1359676800, 1356998400, 1359676800,
                          1359676800, 1359676800, 1359676800, 1354320000,
                          1354320000),
                        class = c("POSIXct", "POSIXt"), tzone = "UTC"),
      portal = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,
                 FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE,
                 TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
      satisfaction = c(10L, 10L, 10L, 9L, 10L, 10L, 9L, 10L, 10L, 9L, 2L,
                       8L, 10L, 9L, 10L, 10L, 9L, 10L, 10L, 10L, 9L, 10L,
                       9L, 10L, 10L)),
      .Names = c("month", "portal", "satisfaction"),
      row.names = c(NA, -25L), class = "data.frame"))

    # Summarizing by month, portal with tapply works:

    > tapply(data$satisfaction, list(data$month, data$portal), mean)
               FALSE      TRUE
    2012-12-01   8.5  8.000000
    2013-01-01  10.0 10.000000
    2013-02-01   9.0  9.545455

    # Summarizing with 'by' argument of data.table does not:

    > data[, mean(satisfaction), by = 'month,portal']
    > data[, mean(satisfaction), by = list(month, portal)]
            month portal        V1
    1: 2013-01-01  FALSE 10.000000
    2: 2013-02-01   TRUE  9.000000
    3: 2013-01-01   TRUE 10.000000
    4: 2012-12-01  FALSE  8.500000
    5: 2012-12-01   TRUE  7.333333
    6: 2013-02-01   TRUE  9.666667
    7: 2013-02-01  FALSE  9.000000
    8: 2012-12-01   TRUE 10.000000

    # Summarizing only this year's data works (ymd() is from the lubridate package):
    > data[month >= ymd(20130101), mean(satisfaction), by = 'month,portal']
            month portal        V1
    1: 2013-01-01   TRUE 10.000000
    2: 2013-01-01  FALSE 10.000000
    3: 2013-02-01   TRUE  9.545455
    4: 2013-02-01  FALSE  9.000000
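
The symptom in the full summary above is that some (month, portal) pairs come back more than once. A quick way to test for that on another machine might look like the sketch below. It is not part of the original test case: the small table `dt` is a made-up stand-in for the data above, and `res`, `key`, `n_dup` and `ref` are illustrative names. It counts repeated group keys in the data.table result and compares the number of groups with what tapply finds.

```r
library(data.table)

# Small stand-in for the data above (hypothetical values):
# POSIXct months in UTC, a logical flag, and an integer score.
dt <- data.table(
  month = as.POSIXct(c("2012-12-01", "2013-01-01", "2013-01-01",
                       "2012-12-01", "2013-02-01", "2012-12-01"), tz = "UTC"),
  portal = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE),
  satisfaction = c(10L, 10L, 10L, 2L, 9L, 8L)
)

# Grouped means from data.table.
res <- dt[, list(V1 = mean(satisfaction)), by = list(month, portal)]

# A correct grouping returns each (month, portal) pair exactly once,
# so no key built from the two grouping columns should repeat.
key <- paste(format(res$month, "%Y-%m-%d"), res$portal)
n_dup <- sum(duplicated(key))

# Cross-check with tapply, which groups independently of data.table.
# Non-NA cells correspond to (month, portal) pairs present in the data.
ref <- tapply(dt$satisfaction, list(dt$month, dt$portal), mean)

cat("duplicated groups:", n_dup, "\n")
cat("groups from data.table:", nrow(res), " from tapply:", sum(!is.na(ref)), "\n")
```

If the grouping behaves, the duplicate count is zero and the two group counts agree; a nonzero duplicate count would match the behaviour reported above.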

Yours Sincerely,
Victor Kryukov
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
