On 01/05/2012 00:28, Antonio Piccolboni wrote:
> Hi,
> I was wondering if there is anything more efficient than split to do the
> kind of conversion in the subject. If I create a data frame as in
>
> system.time({fd = data.frame(x=1:2000, y = rnorm(2000), id = paste("x",
> 1:2000, sep =""))})
>    user  system elapsed
>   0.004   0.000   0.004
>
> and then I try to split it
>
> system.time(split(fd, 1:nrow(fd)))
>    user  system elapsed
>   0.333   0.031   0.415
>
> You will be quick to notice the roughly two orders of magnitude difference
> in time between creation and conversion. Granted, it's not written anywhere

Unsurprising when you create three orders of magnitude more data frames,
is it? That's a list of 2000 data frames. Try

system.time(for(i in 1:2000) data.frame(x = i, y = rnorm(1), id =
paste0("x", i)))

> that they should be similar but the latter seems interpreter-slow to me
> (split is implemented with a lapply in the data frame case) There is also a
> memory issue when I hit about 20000 elements (allocating 3GB when
> interrupted). So before I resort to Rcpp, despite the electrifying feeling
> of approaching the bare metal and for the sake of getting things done, I
> thought I would ask the experts. Thanks
You need to re-think your data structures: 1-row data frames are not
sensible.
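
To make the point about 1-row data frames concrete, here is a minimal
sketch (not from the original thread) of two alternatives that avoid
building 2000 data frames: one plain list per row, built column-wise with
Map(), and splitting the row indices instead of the data frame itself.
The names rows, idx and one_row are illustrative only, and
stringsAsFactors = FALSE is an assumption made so that id stays a
character column.

```r
# Same data as in the question. stringsAsFactors = FALSE is an assumption
# added here so that `id` is a character column on any R version.
fd <- data.frame(x = 1:2000, y = rnorm(2000),
                 id = paste0("x", 1:2000), stringsAsFactors = FALSE)

# Alternative 1: one plain list per row, built column-wise with Map().
# No data frame is constructed per row.
rows <- Map(list, x = fd$x, y = fd$y, id = fd$id)

# Alternative 2: split the row indices rather than the data frame, and
# subset only when a particular group is actually needed.
idx <- split(seq_len(nrow(fd)), fd$id)
one_row <- fd[idx[["x17"]], , drop = FALSE]
```

Whether either form is fast enough is best checked by wrapping it in
system.time() next to the split(fd, 1:nrow(fd)) call on the same machine.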
> Antonio
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
--
Brian D. Ripley, rip...@stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595