I agree; this would be very useful.
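
For anyone who wants to try the idea before it is exported, here is a minimal sketch (not the package's implementation) of one way to check that the approach quoted below avoids a copy. It assumes data.table is installed; set_df_sketch, addr_before and addr_after are names made up for this illustration, and the function body is taken from the setDF proposal in the quoted message.

```r
library(data.table)

# Hypothetical helper mirroring the setDF proposal quoted below:
# it rewrites attributes in place instead of copying the columns.
set_df_sketch <- function(x) {
  if (!is.data.table(x))
    stop("x must be a data.table")
  setattr(x, "row.names", .set_row_names(nrow(x)))
  setattr(x, "class", "data.frame")
  setattr(x, "sorted", NULL)
  setattr(x, ".internal.selfref", NULL)
  invisible(x)
}

dat <- data.table(x = 1:5, y = 6:10)

# Memory address of the first column before the conversion
addr_before <- address(dat[["x"]])

set_df_sketch(dat)  # dat is modified by reference

# Memory address of the same column after the conversion
addr_after <- address(dat[["x"]])

print(class(dat))
print(identical(addr_before, addr_after))
```

Running the same address comparison on the result of as.data.frame() would be one way to observe the copy described in the quoted explanation.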
On Mon, Apr 7, 2014 at 11:29 AM, Chris Neff <[email protected]> wrote:
> I would appreciate such a function, yes. Thanks for the explanation.
>
>
> On Mon, Apr 7, 2014 at 2:25 PM, Arunkumar Srinivasan <[email protected]>
> wrote:
>>
>> as.data.frame is a S3 with .data.table method and is definitely faster
>> than data.frame(). But it still does copy(.). data.frame(.) would also
>> convert strings to factors by default (if stringsAsFactors=TRUE).
>>
>> The most efficient way to convert data.table to data.frame would be to do
>> things by reference (in place). The code is already available in
>> as.data.frame, just remove the copy(.):
>>
>> # convert data.table to data.frame by reference
>> setDF <- function(x) {
>>     if (!is.data.table(x))
>>         stop("x must be a data.table")
>>     setattr(x, "row.names", .set_row_names(nrow(x)))
>>     setattr(x, "class", "data.frame")
>>     setattr(x, "sorted", NULL)
>>     setattr(x, ".internal.selfref", NULL)
>> }
>>
>> Now you've a function that'll convert a data.table to data.frame by
>> reference.
>>
>> require(data.table)
>> dat <- data.table(x=1:5, y=6:10)
>> setDF(dat) # dat is now a data.frame
>>
>> Probably we should export this function as well, like setDT so that users
>> can switch between the two as they desire without hitting performance?
>>
>>
>> Arun
>>
>> From: Chris Neff [email protected]
>> Reply: Chris Neff [email protected]
>> Date: April 7, 2014 at 5:32:47 PM
>> To: [email protected]
>> [email protected]
>> Subject: [datatable-help] Is there any overhead to converting back and
>> forth from a data.table to a data.frame?
>>
>> I prefer data.tables for all the code processing I do. But others on my
>> team using my functions aren't comfortable with data.tables, so most of the
>> libraries I write end with
>>
>> return(data.frame(DT))
>>
>> Is there any copying or other overhead happening there? Since it inherits
>> from data.frame, I think the answer is no.
>>
>> Now, if I have a function that does such a return, but I wrap that itself
>> in a data.table call:
>>
>> data.table(func_that_returns_df())
>>
>> Is there any inefficiency there? Is there a difference between
>> data.table() and as.data.table() here?
>> _______________________________________________
>> datatable-help mailing list
>> [email protected]
>>
>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
