>>>>> peter dalgaard >>>>> on Thu, 14 Mar 2019 16:18:55 +0100 writes:
> I have no recollection of the original rationale for
> as.data.frame.table, but I actually think it is fine as it is:

> The classifying _factors_ of a crosstable should be factors unless
> very specifically directed otherwise and that should not depend on
> the setting of an option that controls the conversion of character
> data.

> For as.data.frame.matrix, in contrast, it is the _content_ of the
> matrix that is being converted, and it seems much more reasonable to
> follow the same path as for other character data.

> -pd

I very strongly agree that as.data.frame.table() should not be
changed to follow a global option.

To the contrary: I've repeatedly mentioned that in my view it has
been a design mistake to allow data.frame() and as.data.frame() be
influenced by a global option
[and we should've tried harder to keep things purely functional
 (R remaining as closely as possible a "functional language"),
 e.g. by providing wrapper functions the same way we have such
 wrappers for versions of read.table() with different defaults
 for some of the arguments]

Martin

>> On 12 Mar 2019, at 21:39 , Mychaleckyj, Josyf C (jcm6t) <jc...@virginia.edu> wrote:
>>
>> Reporting a possible inconsistency or bug in handling stringsAsFactors in as.data.frame.table()
>>
>> Here is a simple test
>>
>> > options()$stringsAsFactors
>> [1] TRUE
>> > x<-c("a","b","c","a","b")
>> > d<-as.data.frame(table(x))
>> > d
>>   x Freq
>> 1 a    2
>> 2 b    2
>> 3 c    1
>> > class(d$x)
>> [1] "factor"
>> > d2<-as.data.frame(table(x),stringsAsFactors=F)
>> > class(d2$x)
>> [1] "character"
>> > options(stringsAsFactors=F)
>> > options()$stringsAsFactors
>> [1] FALSE
>> > d3<-as.data.frame(table(x))
>> > d3
>>   x Freq
>> 1 a    2
>> 2 b    2
>> 3 c    1
>> > class(d3$x)
>> [1] "factor"
>> > d4<-as.data.frame(table(x),stringsAsFactors=F)
>> > class(d4$x)
>> [1] "character"
>>
>>
>> # Display the code showing the different stringsAsFactors handling in table and matrix:
>>
>> > as.data.frame.table
>> function (x, row.names = NULL, ..., responseName = "Freq", stringsAsFactors = TRUE,
>>     sep = "", base = list(LETTERS))
>> {
>>     ex <- quote(data.frame(do.call("expand.grid", c(dimnames(provideDimnames(x,
>>         sep = sep, base = base)), KEEP.OUT.ATTRS = FALSE, stringsAsFactors = stringsAsFactors)),
>>         Freq = c(x), row.names = row.names))
>>     names(ex)[3L] <- responseName
>>     eval(ex)
>> }
>> <bytecode: 0x28769f8>
>> <environment: namespace:base>
>>
>> > as.data.frame.matrix
>> function (x, row.names = NULL, optional = FALSE, make.names = TRUE,
>>     ..., stringsAsFactors = default.stringsAsFactors())
>> {
>>     d <- dim(x)
>>     nrows <- d[[1L]]
>>     ncols <- d[[2L]]
>>     ic <- seq_len(ncols)
>>     dn <- dimnames(x)
>>     if (is.null(row.names))
>>         row.names <- dn[[1L]]
>>     collabs <- dn[[2L]]
>>     if (any(empty <- !nzchar(collabs)))
>>         collabs[empty] <- paste0("V", ic)[empty]
>>     value <- vector("list", ncols)
>>     if (mode(x) == "character" && stringsAsFactors) {
>>         for (i in ic) value[[i]] <- as.factor(x[, i])
>>     }
>>     else {
>>         for (i in ic) value[[i]] <- as.vector(x[, i])
>>     }
>>     autoRN <- (is.null(row.names) || length(row.names) != nrows)
>>     if (length(collabs) == ncols)
>>         names(value) <- collabs
>>     else if (!optional)
>>         names(value) <- paste0("V", ic)
>>     class(value) <- "data.frame"
>>     if (autoRN)
>>         attr(value, "row.names") <- .set_row_names(nrows)
>>     else .rowNamesDF(value, make.names = make.names) <- row.names
>>     value
>> }
>> <bytecode: 0x29995c0>
>> <environment: namespace:base>
>>
>>
>> > sessionInfo()
>> R version 3.5.2 (2018-12-20)
>> Platform: x86_64-pc-linux-gnu (64-bit)
>> Running under: CentOS Linux 7 (Core)
>>
>> Matrix products: default
>> BLAS: /usr/lib64/libblas.so.3.4.2
>> LAPACK: /usr/lib64/liblapack.so.3.4.2
>>
>> locale:
>>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> loaded via a namespace (and not attached):
>> [1] compiler_3.5.2 tools_3.5.2
>>
>> Thanks,
>> Joe
>>
>>
>>
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd....@cbs.dk  Priv: pda...@gmail.com
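
A rough sketch of the wrapper-function idea mentioned above, in case it helps make the distinction concrete. The names data_frame_chr() and data_frame_fct() are hypothetical and not part of base R; they are only meant to mirror the way read.csv() and read.delim() wrap read.table() with different defaults. Each wrapper passes stringsAsFactors explicitly, so the result should not depend on any global option. The remaining lines restate the table/matrix contrast from the thread with the argument always given explicitly.

```r
# Hypothetical wrappers (illustrative names, not base R functions).
# Each one fixes stringsAsFactors explicitly instead of consulting
# options("stringsAsFactors").
data_frame_chr <- function(...) data.frame(..., stringsAsFactors = FALSE)
data_frame_fct <- function(...) data.frame(..., stringsAsFactors = TRUE)

x <- c("a", "b", "c", "a", "b")

chr_class <- class(data_frame_chr(x = x)$x)   # character column
fct_class <- class(data_frame_fct(x = x)$x)   # factor column

# as.data.frame.table(): the classifying variable is a factor unless
# the caller explicitly asks otherwise.
tab_default  <- class(as.data.frame(table(x))$x)
tab_explicit <- class(as.data.frame(table(x), stringsAsFactors = FALSE)$x)

# as.data.frame.matrix(): the matrix *content* is converted, so the
# explicit argument decides between character and factor columns.
m <- matrix(x[1:4], nrow = 2)
mat_chr <- class(as.data.frame(m, stringsAsFactors = FALSE)$V1)
mat_fct <- class(as.data.frame(m, stringsAsFactors = TRUE)$V1)

print(c(chr_class, fct_class, tab_default, tab_explicit, mat_chr, mat_fct))
```

Because the argument is always spelled out, the sketch is intended to behave the same whatever options(stringsAsFactors=) is set to in the session.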