Andy had written:
> >... The drop=FALSE argument has nothing to do with what
> >Christian was talking about. The kind of thing he meant is PR# 8192,
> >"Subject: [ subscripting sometimes loses names":
> >
> > http://bugs.r-project.org/cgi-bin/R/wishlist?id=8192
>
On Sun, Feb 1, 2009 at 12:25 PM, Tim Hesterberg <timhesterb...@gmail.com> wrote:

> (Later comments on the thread pointed out the difference between
> x[,1] for matrices and data frames.)
>
> I rewrote the S-PLUS data frame code around then, to fix
> various inconsistencies and improve efficiency.
> This was probably my change, and I would do it again.
>
> Note that the components of a data frame do not have names
> attached to them; the row names are a separate object.
> Extracting a component vector or matrix from a data frame should not
> attach names to the result, because of:
> * memory (attaching row names to an object can more than double the
>   size of the object),
> * speed,
> * some objects cannot take names, and attaching them could change
>   the class and other behavior of an object, and
> * the names are usually/often (depending on the user) meaningless,
>   artifacts of an early design decision that all data frames have row names.
>
> Data frames differ from matrices in two ways that matter here:
> * columns in matrices are all the same kind, and are simple objects
>   (numeric, etc.), whereas components of data frames can be nearly
>   arbitrary objects, and
> * row names get added to a data frame whether a user wants them or not,
>   whereas row names on a matrix have to be specified.
>
> A historical note - unique row names on data frames were a design
> decision made when people worked with small data frames, and are
> convenient for small data frames. But they are a problem for large
> data frames. I was writing for all users, not just those with small
> data frames and meaningful names.

Hi Tim,

Thank you for explaining this so carefully. It's very valuable to hear
the rationale behind a design decision laid out in such detail. I accept
that yours is the right solution for general use.

In our case, we deal with not too many rows, up to a few thousand, with
meaningful names. And we mostly use data frames.
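To make the difference concrete, here is a minimal sketch in R of the
x[,1] behaviour mentioned above. It uses toy data, and the object names
(m, df, m_col, df_col, df_named) are made up for this illustration; they
are not taken from our code or from the bug report.

```r
# Hypothetical toy data; all object names are made up for this illustration.
m <- matrix(1:6, nrow = 3,
            dimnames = list(c("a", "b", "c"), c("x", "y")))
df <- as.data.frame(m)

# Matrix: extracting a column carries the row names along as names.
m_col <- m[, 1]
names(m_col)

# Data frame: the row names are a separate attribute of the data frame,
# so the extracted column comes back without names.
df_col <- df[, 1]
names(df_col)

# If the names are wanted, they can be re-attached explicitly.
df_named <- setNames(df[, 1], rownames(df))
names(df_named)
```

With a standard R installation I would expect the matrix column to keep
"a", "b", "c" as names, the data frame column to have NULL names, and the
explicit setNames() call to restore them.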
Because of our special situation, we wrote our own "[" methods, which
normally do what's right for us. That's why, in one debugging session,
it was necessary to "get" the overridden, stock R method from
package:base. In that case, the object happened to be a matrix, not a
data frame, and R got a segmentation fault. And that's why I submitted
the bug report that sparked this discussion.

/Christian

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel