Hi Philippos,
Try this:
dat1<- read.csv("Validation_data_set3.csv",sep=",",stringsAsFactors=FALSE)
#converted to csv
str(dat1)
#'data.frame': 12573 obs. of 17 variables:
# $ Removed.AGC : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST : chr "" "46.1658" "41.2566" "14.0931" ...
# $ Removed.Kurtosis : num NA NA NA NA 5.38 ...
# $ Removed.Skewness : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.QC17999 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.QC16200 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC : chr "" "46.1658" "41.2566" "14.0931" ...
# $ Removed.Kurtosis.Skewness : num NA NA NA NA 5.38 ...
# $ Removed.AGC.QC16200 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999.3.stdevs : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999.less.than.1 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC.QC17999 : chr "" "46.1658" "41.2566" "14.0931" ...
# $ Removed.SST.AGC.QC16200 : chr "" "46.1658" "41.2566" "14.0931" ...
# $ Removed.SST.AGC.Kurtosis.Skewness : chr "" "" "" "" ...
# $ Removed.SST.AGC.Kurtosis.Skewness.QC17999: chr "" "" "" "" ...
# $ Removed.SST.AGC.Kurtosis.Skewness.QC16200: chr "" "" "" "" ...
#Found these characters in columns that are not numeric
# List the entries containing "#" in each character column
do.call(rbind, lapply(dat1, function(x) {
  x1 <- x[is.character(x)]; x1[grepl("\\#", x1)]}))
# [,1] [,2] [,3]
#Removed.SST "#DIV/0!" "#DIV/0!" "#DIV/0!"
#Removed.SST.AGC "#DIV/0!" "#DIV/0!" "#DIV/0!"
#Removed.SST.AGC.QC17999 "#DIV/0!" "#DIV/0!" "#DIV/0!"
#Removed.SST.AGC.QC16200 "#DIV/0!" "#DIV/0!" "#DIV/0!"
#Removed.SST.AGC.Kurtosis.Skewness "#DIV/0!" "#DIV/0!" "#DIV/0!"
#Removed.SST.AGC.Kurtosis.Skewness.QC17999 "#DIV/0!" "#DIV/0!" "#DIV/0!"
#Removed.SST.AGC.Kurtosis.Skewness.QC16200 "#DIV/0!" "#DIV/0!" "#DIV/0!"
# [,4]
#Removed.SST "#DIV/0!"
#Removed.SST.AGC "#DIV/0!"
#Removed.SST.AGC.QC17999 "#DIV/0!"
#Removed.SST.AGC.QC16200 "#DIV/0!"
#Removed.SST.AGC.Kurtosis.Skewness "#DIV/0!"
#Removed.SST.AGC.Kurtosis.Skewness.QC17999 "#DIV/0!"
#Removed.SST.AGC.Kurtosis.Skewness.QC16200 "#DIV/0!"
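In case other stray strings are hiding in the file, here is a more general check, as a rough sketch only. The vector `x` below is a made-up sample column, not your data. It flags every non-empty value that as.numeric() cannot convert, rather than searching for "#" specifically:

```r
# Hypothetical sample column mixing numbers, blanks and an Excel error string
x <- c("", "46.1658", "#DIV/0!", "14.0931")

# A value is "bad" if it is non-empty but cannot be converted to a number.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
bad <- !is.na(x) & x != "" & is.na(suppressWarnings(as.numeric(x)))

# Distinct offending strings found in the column
bad_values <- unique(x[bad])
print(bad_values)
```

To apply the same idea to a whole data frame, the body could be wrapped in a function and passed to lapply() over the character columns.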
# Set "#DIV/0!" entries to NA, then convert each column ("" also becomes NA)
dat2 <- as.data.frame(sapply(dat1, function(x) {
  x[is.character(x)][grep("\\#", x[is.character(x)])] <- NA; as.numeric(x)}))
str(dat2)
#'data.frame': 12573 obs. of 17 variables:
# $ Removed.AGC : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST : num NA 46.17 41.26 14.09 5.38 ...
# $ Removed.Kurtosis : num NA NA NA NA 5.38 ...
# $ Removed.Skewness : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.QC17999 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.QC16200 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC : num NA 46.17 41.26 14.09 5.38 ...
# $ Removed.Kurtosis.Skewness : num NA NA NA NA 5.38 ...
# $ Removed.AGC.QC16200 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999.3.stdevs : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999.less.than.1 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC.QC17999 : num NA 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC.QC16200 : num NA 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC.Kurtosis.Skewness : num NA NA NA NA 5.38 ...
# $ Removed.SST.AGC.Kurtosis.Skewness.QC17999: num NA NA NA NA 5.38 ...
# $ Removed.SST.AGC.Kurtosis.Skewness.QC16200: num NA NA NA NA 5.38 ...
head(dat2,3)
# Removed.AGC Removed.SST Removed.Kurtosis Removed.Skewness Removed.QC17999
#1 65.6738 NA NA 65.6738 65.6738
#2 46.1658 46.1658 NA 46.1658 46.1658
#3 41.2566 41.2566 NA 41.2566 41.2566
# Removed.QC16200 Removed.SST.AGC Removed.Kurtosis.Skewness Removed.AGC.QC16200
#1 65.6738 NA NA 65.6738
#2 46.1658 46.1658 NA 46.1658
#3 41.2566 41.2566 NA 41.2566
# Removed.AGC.QC17999 Removed.AGC.QC17999.3.stdevs
#1 65.6738 65.6738
#2 46.1658 46.1658
#3 41.2566 41.2566
# Removed.AGC.QC17999.less.than.1 Removed.SST.AGC.QC17999
#1 65.6738 NA
#2 46.1658 46.1658
#3 41.2566 41.2566
# Removed.SST.AGC.QC16200 Removed.SST.AGC.Kurtosis.Skewness
#1 NA NA
#2 46.1658 NA
#3 41.2566 NA
# Removed.SST.AGC.Kurtosis.Skewness.QC17999
#1 NA
#2 NA
#3 NA
# Removed.SST.AGC.Kurtosis.Skewness.QC16200
#1 NA
#2 NA
#3 NA
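Alternatively, it may be possible to avoid the conversion step by telling read.csv() which strings to treat as missing, via its `na.strings` argument. The following is only a sketch: `csv_text` is a small made-up sample standing in for your file, and I am assuming that "" and "#DIV/0!" are the only non-numeric entries.

```r
# Hypothetical three-row sample standing in for Validation_data_set3.csv
csv_text <- "Removed.AGC,Removed.SST
65.6738,#DIV/0!
46.1658,46.1658
41.2566,
"

# Treat blanks and the Excel error string as NA at read time, so that
# the affected columns are parsed as numeric instead of character
dat3 <- read.csv(text = csv_text,
                 na.strings = c("", "NA", "#DIV/0!"),
                 stringsAsFactors = FALSE)
str(dat3)
```

With the real file, `text = csv_text` would be replaced by the file name. If a column still comes back as character, it contains some other string that would need to be added to `na.strings`.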
I work as a postdoc at Wayne State University, Detroit.
Regards,
A.K.
________________________________
From: Philippos Tsourkas <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Tuesday, April 16, 2013 6:07 PM
Subject: R question
Hello Arun, and thank you for your offer to help. I am sending you the xlsx
file I am trying to use. I save it as a csv, read it into R using
read.csv, and then extract the columns. Some columns are numeric and
contain NA instead of blank spaces (e.g. column 1), while other columns (e.g.
column 2) contain blank spaces instead of NA and are not numeric. I can't
figure out what's causing this or how to deal with it. Basically, all columns
should be numeric with NAs instead of blank spaces.
What do you do by the way?
Thanks again,
Philippos
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.