Hi Chris,

I see where your coworker is coming from. FRs 1224 and 1493 are next on the agenda for v1.6.7. Sort the FR tracker by descending priority (5 = highest) to see what we plan to do next. Fast sorting of character vectors has been the sticking point.
You might want to look at countingcharacterorder.c, which is already part of the package but not yet hooked up. Basically, we plan to allow character columns in keys, since we now have a fast way to sort character vectors. data.table() will no longer coerce character to factor, and a known performance issue with a very large number of levels should be fixed at the same time. It's all a bit tricky because it's closely related to how R's global string cache works.

We're thinking of doing it in 1.6.7, which will then become v1.7. Don't hold your breath though, it is tricky.

Matthew

On Thu, 2011-08-25 at 10:38 -0400, Chris Neff wrote:
> Hi all,
>
> I've been pondering the following. One of my coworkers doesn't like
> data.table because of the fact that he doesn't like factors. Namely
> things like adding a new value to a factor field only to have it choke
> because it isn't one of the levels. Also often times the variable is
> something like a list of subnested categories, and sometimes he will
> do a substitute to go up a level in the categories. This is a pain
> when they are factors.
>
> Suffice to say, his work flow just makes a lot more sense to him when
> they are characters and he doesn't have to worry about underlying
> levels and the like.
>
> How hard would an "implicit factor" be? Something that to the user
> behaves exactly like a normal character variable, but internally
> data.frame is keeping the mapping of character values to integer codes
> somewhere behind the scene.
>
> This is my thrust towards a hack at allowing character vectors to be
> keys. If the real right way is much simpler than what this would take
> please ignore me.
>
> -Chris
> _______________________________________________
> datatable-help mailing list
> [email protected]
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
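
For illustration only, the "implicit factor" idea from the quoted message (values that read and write like plain strings while integer codes are kept behind the scenes) might be sketched roughly as below. This is a Python sketch under stated assumptions, not how data.table implements or plans to implement character keys: the class name ImplicitFactor and all of its methods are hypothetical, and it does not touch R's global string cache.

```python
class ImplicitFactor:
    """Hypothetical container: reads and writes like a list of strings,
    but stores one integer code per element plus a table of distinct values."""

    def __init__(self, values=()):
        self._levels = []   # distinct strings, in order of first appearance
        self._index = {}    # string -> integer code (its position in _levels)
        self._codes = []    # one integer code per element
        for v in values:
            self.append(v)

    def _code_for(self, value):
        # Register an unseen string on the fly instead of raising an error.
        code = self._index.get(value)
        if code is None:
            code = len(self._levels)
            self._index[value] = code
            self._levels.append(value)
        return code

    def append(self, value):
        self._codes.append(self._code_for(value))

    def __len__(self):
        return len(self._codes)

    def __getitem__(self, i):
        # Integer positions only; slicing is left out of this sketch.
        return self._levels[self._codes[i]]

    def __setitem__(self, i, value):
        self._codes[i] = self._code_for(value)

    def to_list(self):
        return [self._levels[c] for c in self._codes]

    def sorted_order(self):
        # Rank the distinct strings once, then order rows by integer rank.
        rank = {lvl: r for r, lvl in enumerate(sorted(self._levels))}
        return sorted(range(len(self._codes)),
                      key=lambda i: rank[self._levels[self._codes[i]]])


x = ImplicitFactor(["fruit/apple", "veg/kale", "fruit/apple"])
x.append("fruit/pear")        # a new value, no "not one of the levels" failure
x[1] = x[1].split("/")[0]     # go up one level in a nested category
print(x.to_list())
```

The point of the sketch is that the user never sees the codes: adding a new value or rewriting a category works as it would on strings, while ordering can still be done on integers after ranking the distinct values once.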
