On Thu, 2006-09-28 at 12:27 +1000, [EMAIL PROTECTED] wrote:
> I am hoping for some advice regarding the difficulties I have been having
> recoding variables which are contained in a csv file. Table 1 (below)
> shows there are two types of blanks - as reported in the first two
> columns. I am using Windows XP & the latest version of R.
>
> When blank cells are replaced with a value of n using syntax:
> affect[affect==""] <- "n"
> there are still 3 blank values (Table 2). When as.numeric is applied,
> this also causes problems because values of 2, 3 & 4 are generated rather
> than just 1 & 2.
>
> TABLE 1
>
> > table(group,actions)
>      actions
> group       n y
>     1 100 2 0 3
>     2  30 1 1 0
>     3  24 0 0 0
>
>
> TABLE 2
>
> > table(group,actions)
>      actions
> group       n y
>     1 0 2 100 3
>     2 0 1  31 0
>     3 0 0  24 0
>
>
> Below is another example - for some reason there are 2 types of 'aobh'
> values.
>
> > table(group, type)
>      type
> group aobh aobh  gbh  m uw
>     1  104     1   0  0  0
>     2    0     0  15  0 17
>     3    0     0   0 24  0
>
>
> Any assistance is much appreciated,
>
>
> Bob Green
Bob,

A quick heads up: my presumption is that "aobh" and "aobh " are being treated as different values simply because of leading/trailing spaces within the delimited fields of the source data file. This is also the likely reason for the multiple missing/blank values in your imported data set.

Presuming that you used one of the read.table() family of functions (i.e. read.csv()), take note of the 'strip.white' argument in ?read.table, which defaults to FALSE. If you change it to TRUE, the function will strip leading and trailing blanks, likely resolving this issue.

HTH,

Marc Schwartz
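The strip.white suggestion can be sketched as follows. This is only an illustration under stated assumptions: the CSV content is made up (it is not the poster's file), the names csv_lines, csv_text, raw, clean and actions_code are hypothetical, and it relies on the 'text' argument of read.csv(), which exists in current versions of R. Note that ?read.table documents strip.white as applying to unquoted character fields when a separator is specified, so quoted fields containing spaces would need separate handling.

```r
# Hypothetical data standing in for the poster's CSV file (not the real file).
# Note the trailing space in "aobh " and "uw ", and the single-space "blank".
csv_lines <- c(
  "group,type,actions",
  "1,aobh,",
  "1,aobh ,y",
  "1,aobh, ",
  "2,gbh,n",
  "2,uw ,",
  "3,m,"
)
csv_text <- paste(csv_lines, collapse = "\n")

# Default behaviour: leading/trailing white space in fields is kept.
raw <- read.csv(text = csv_text, strip.white = FALSE, stringsAsFactors = FALSE)

# With strip.white = TRUE, white space around unquoted fields is removed.
clean <- read.csv(text = csv_text, strip.white = TRUE, stringsAsFactors = FALSE)

table(raw$group, raw$type)      # "aobh" and "aobh " are tabulated separately
table(clean$group, clean$type)  # only one "aobh" category remains

# Recode the remaining empty strings, then build the factor afterwards so
# that it carries no leftover empty level when converted to numeric codes.
clean$actions[clean$actions == ""] <- "n"
actions_code <- as.numeric(factor(clean$actions, levels = c("n", "y")))
```

Creating the factor only after the recode is one way to avoid the extra codes described in the question: as.numeric() on a factor returns the level codes, so unused blank levels left over from the import would still occupy the first positions.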
