Hi,

I guess I'm missing something, but why isn't your proposed droplevels.data.table consistent with base? Is it because the ordering of the rows might change?

-steve

On Tue, Feb 21, 2012 at 4:42 PM, Matthew Dowle <[email protected]> wrote:
>
> Yes, could do. Building on that here's a quick stab at
> droplevels.data.table. This does it by reference, or it could take a
> copy(). If it takes a copy() it would be consistent with base (probably
> required), but then how best to make a non-copying version available?
>
> droplevels.data.table = function(dt) {
>     oldkey = key( dt )
>     for (i in names(dt)) {
>         if (is.factor(dt[[i]])) dt[,i:=droplevels(dt[[i]]),with=FALSE]
>     }
>     setkeyv( dt, oldkey )
>     dt
> }
>
> On Tue, 2012-02-21 at 15:38 -0500, Prasad Chalasani wrote:
>> Meanwhile as a work-around, I suppose one should do:
>>
>> keys <- key( dt ) # this could in general be a large set of keys
>> sub_d <- droplevels( as.data.frame( dt[ name != 'a' ] ) )
>> sub_dt <- data.table( sub_d )
>> setkeyv( sub_dt, keys )
>>
>>
>> On Feb 21, 2012, at 1:59 PM, Matthew Dowle wrote:
>>
>> >
>> > I see the problem too but (just) adding droplevels.data.table might miss
>> > the root cause.
>> >
>> >> because the way the
>> >> droplevels.data.frame method works isn't compatible with data.table
>> >> indexing.
>> >
>> > But it's intended to be. I can see the switch at the top of [.data.table
>> > is detecting the caller isn't data.table aware, and it is then dispatching
>> > to `[.data.frame` but why it then isn't working I'm not sure. Something to
>> > do with the missing j or missing drop not being passed through correctly,
>> > perhaps.
>> >
>> > I have heard it said (once or twice) that data.table is "almost"
>> > compatible with non-data.table-aware packages, but never had an example
>> > before. I wonder if this is it!
>> >
>> > A (fast) droplevels.data.table using := would be good anyway, though.
>> >
>> > Matthew
>> >
>> >
>> >> Hi,
>> >>
>> >> I see what the problem is -- we need to provide a
>> >> droplevels.data.table S3 method, because the way the
>> >> droplevels.data.frame method works isn't compatible with data.table
>> >> indexing.
>> >>
>> >> Will fix:
>> >>
>> >> https://r-forge.r-project.org/tracker/index.php?func=detail&aid=1841&group_id=240&atid=975
>> >>
>> >> Thanks for raising the flag.
>> >>
>> >> Cheers,
>> >> -steve
>> >>
>> >> On Tue, Feb 21, 2012 at 12:38 PM, pchalasani <[email protected]> wrote:
>> >>> Surprising that this wasn't noticed before, or perhaps I'm not
>> >>> following some recommended idiom to drop levels when using
>> >>> data.table. The following code illustrates the bug clearly: The bug
>> >>> remains regardless of whether I use "subset" or simply use
>> >>> dt1 = dt[ name != 'a' ].
>> >>>
>> >>> d <- data.table(name = c('a','b','c'), value = 1:3)
>> >>> dt <- data.table(d)
>> >>> setkey(dt,'name')
>> >>> dt1 <- subset(dt,name != 'a') # or dt1 <- dt[ name != 'a' ]
>> >>> > dt1
>> >>>      name value
>> >>> [1,]    b     2
>> >>> [2,]    c     3
>> >>>
>> >>> > droplevels(dt1)
>> >>>      name value
>> >>> [1,]    b     1
>> >>> [2,]    c     3
>> >>>
>> >>> --
>> >>> View this message in context:
>> >>> http://r.789695.n4.nabble.com/BUG-droplevels-mangles-subsetted-data-table-tp4407694p4407694.html
>> >>> Sent from the datatable-help mailing list archive at Nabble.com.
>> >>> _______________________________________________
>> >>> datatable-help mailing list
>> >>> [email protected]
>> >>> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
>> >>
>> >> --
>> >> Steve Lianoglou
>> >> Graduate Student: Computational Systems Biology
>> >> | Memorial Sloan-Kettering Cancer Center
>> >> | Weill Medical College of Cornell University
>> >> Contact Info: http://cbio.mskcc.org/~lianos/contact

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
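
The by-reference versus copy() question discussed in this thread can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the method that was eventually adopted: the helper names drop_levels_by_ref and drop_levels_copy are hypothetical, the example table is made up, and set() is used here in place of the `:=` ... with=FALSE form quoted above. The point it tries to show is that a by-reference version changes the caller's table, whereas base droplevels() returns a new object and leaves its input alone.

```r
library(data.table)

# Hypothetical helper (not part of data.table): drop unused factor levels
# in place, restoring the key afterwards as in the quoted proposal.
drop_levels_by_ref <- function(dt) {
  oldkey <- key(dt)
  for (col in names(dt)) {
    if (is.factor(dt[[col]])) {
      # set() assigns by reference, so the caller's table is modified
      set(dt, j = col, value = droplevels(dt[[col]]))
    }
  }
  if (!is.null(oldkey)) setkeyv(dt, oldkey)
  invisible(dt)
}

# Hypothetical helper: same result, but on a copy, so the input is left
# untouched (the behaviour base droplevels() has for data frames).
drop_levels_copy <- function(dt) {
  drop_levels_by_ref(copy(dt))
}

# Made-up example data with an explicit factor column
dt <- data.table(name = factor(c("a", "b", "c")), value = 1:3)
setkey(dt, name)
sub_dt <- dt[name != "a"]

res <- drop_levels_copy(sub_dt)
levels_before <- levels(sub_dt$name)  # "a" "b" "c": input unchanged
levels(res$name)                      # "b" "c"

drop_levels_by_ref(sub_dt)
levels(sub_dt$name)                   # "b" "c": input modified in place
```

One possible way to offer both behaviours would be to keep the copying version as the S3 method and expose the in-place one under a separate name, but that is a design choice the thread leaves open.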
