Günter,

Great report! I'm able to reproduce it on 1.8.11 here. Will file a bug and look into it. Thanks again for reporting.
Arun

From: Günter J. Hitsch
Reply: Günter J. Hitsch [email protected]
Date: January 22, 2014 at 9:52:36 PM
To: [email protected]
Subject: [datatable-help] segfault with "large" number of rows

I’ve been using data.table for several months. It’s a great package; thank you for developing it!

Here’s my question: I’ve run into a problem when I use “large” data tables with many millions of rows. In particular, for such large data tables I get segmentation faults when I create columns by groups. Example:

N = 2500       # No. of groups
T = 100000     # No. of observations per group

DT = data.table(group = rep(1:N, each = T), x = 1)
setkey(DT, group)

DT[, sum_x := sum(x), by = group]
print(head(DT))

This runs fine. But when I increase the number of groups, say from 2500 to 3000, I get a segfault:

N = 3000       # No. of groups
T = 100000     # No. of observations per group
...

*** caught segfault ***
address 0x159069140, cause 'memory not mapped'

Traceback:
 1: `[.data.table`(DT, , `:=`(sum_x, sum(x)), by = group)
 2: DT[, `:=`(sum_x, sum(x)), by = group]
 3: eval(expr, envir, enclos)
 4: eval(ei, envir)
 5: withVisible(eval(ei, envir))

I can reproduce this problem on:

(1) OS X 10.9, R 3.0.2, data.table 1.8.10
(2) Ubuntu 13.10, R 3.0.1, data.table 1.8.10

And of course the amount of RAM in my machines is not the issue.

Thanks in advance for your help with this!

Günter
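Until the bug is fixed, one possible interim workaround is to aggregate into a small per-group table first and then join the sums back, rather than assigning by group with :=. This is only a sketch and assumes the crash is specific to the grouped := assignment path; the names and sizes below simply mirror the example above.

library(data.table)

N = 3000       # No. of groups (the size that triggered the segfault above)
T = 100000     # No. of observations per group

DT = data.table(group = rep(1:N, each = T), x = 1)
setkey(DT, group)

# Aggregate into a small per-group table first ...
grp = DT[, list(sum_x = sum(x)), by = group]
setkey(grp, group)

# ... then join the per-group sums back onto DT, avoiding := by group entirely.
DT = grp[DT]
print(head(DT))

The keyed join recycles each group's sum_x across that group's rows, so the result matches what the := version would produce, at the cost of materialising a new table instead of updating DT by reference.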
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
