I ran into a problem today when using a conditional to subset a
data.frame and tracked it down to a difference in how NA is treated when
sub-setting matrices versus data.frames. A self-contained example is
below, followed by sessionInfo(). I'm not questioning whether the behavior
is documented, but the rationale for its existence.

Could someone explain to me why the difference is logical and useful? This
seems more of a devel than a help issue; my apologies if I've posted to the
wrong list.

Mark
#
a.vec <- c("A", "", "B", "DEF", NA, "", NA, "Q")
a.vec[a.vec == ""] <- NA
a.vec
## [1] "A"   NA    "B"   "DEF" NA    NA    NA    "Q"

a.mat <- matrix(rep(c("A", "", "B", "DEF", NA, "", NA, "Q"), 5), nrow = 5,
ncol = 8)
a.mat[a.mat[,3] == "", 3] <- NA
a.mat
##      [,1]  [,2] [,3]  [,4]  [,5] [,6]  [,7] [,8]
## [1,] "A"   ""   "B"   "Q"   NA   ""    NA   "DEF"
## [2,] ""    NA   "DEF" "A"   ""   "B"   "Q"  NA
## [3,] "B"   "Q"  NA    ""    NA   "DEF" "A"  ""
## [4,] "DEF" "A"  NA    "B"   "Q"  NA    ""   NA
## [5,] NA    ""   NA    "DEF" "A"  ""    "B"  "Q"

a.df <- data.frame(matrix(rep(c("A", "", "B", "DEF", NA, "", NA, "Q"), 5),
nrow = 5, ncol = 8))
a.df[a.df[,3] == "", 3] <- NA
a.df
## Error in `[<-.data.frame`(`*tmp*`, a.df[, 3] == "", 3, value = NA) :
##   missing values are not allowed in subscripted assignments of data frames

## Enter a frame number, or 0 to exit

## 1: `[<-`(`*tmp*`, a.df[, 3] == "", 3, value = NA)
## 2: `[<-.data.frame`(`*tmp*`, a.df[, 3] == "", 3, value = NA)
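
For reference, here is a minimal sketch of a workaround. It is only one possible approach and does not answer the question of rationale. It assumes the goal is to assign NA only where the comparison is TRUE. The name `idx` is introduced here purely for illustration.

```r
# Rebuild the data.frame from the example above.
a.df <- data.frame(matrix(rep(c("A", "", "B", "DEF", NA, "", NA, "Q"), 5),
                          nrow = 5, ncol = 8))

# The comparison yields NA wherever the column itself is NA; these NAs in
# the row index are what `[<-.data.frame` rejects.
idx <- a.df[, 3] == ""
idx
## [1] FALSE FALSE    NA  TRUE    NA

# which() returns only the positions that are TRUE, dropping the NAs,
# so the assignment receives a plain integer index.
a.df[which(idx), 3] <- NA
a.df[, 3]
```

With which(), the rows whose comparison result is NA are left untouched, which matches what the matrix assignment above does.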
## remove plain text non-codes from codes.df
sessionInfo()
## R version 2.10.0 Patched (2009-10-27 r50222)
## x86_64-unknown-linux-gnu

## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
##  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

## attached base packages:
## [1] stats     graphics  grDevices datasets  utils     methods   base

## loaded via a namespace (and not attached):
## [1] tools_2.10.0
Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work, & Mobile & VoiceMail
(317) 399-1219 Skype No Voicemail please


______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
