On 01-Nov-04 Robert Brown FM CEFAS wrote:
> I have a data set of about 10000 records which was compiled from
> several smaller data sets using SPSS. During compilation 88 false
> records were accidentally introduced which comprise all NA values. I
> want to delete these records but not other missing data. The functions
> na.exclude and na.omit seem to remove all values of NA? How can I
> delete just the relevant NA's? I.e., I want to delete all records in
> the data frame DATA where the field age contains NA values

Hi Robert,

It's not quite clear what your "NA" criterion for deletion really is.

If (as you state first) the false records "comprise all NA values",
this suggests that in such a record every field is "NA".

On the other hand, you say you "want to delete all records in the data
frame DATA where the field age contains NA values", so it looks as
though you can check for deletion on the field "age" only.

Suppose your dataframe is called DF.

In the second case, which is simpler, you can do

  newDF <- DF[!is.na(DF$age),]

In the first case, it's fundamentally the same, but you have to run
the check over every element in each row. So define a function

  notallna <- function(x){!all(is.na(x))}

and then

  newDF <- DF[apply(DF,1,notallna),]

This keeps every record in which not all fields are "NA", so it
retains records in which only some fields are "NA".

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 01-Nov-04                                       Time: 16:03:07
------------------------------ XFMail ------------------------------
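
P.S. A small worked illustration of the two cases above, as a sketch
only. The toy data frame and the names byAge, notAllNA and notAllNA2
are made up for the example and are not from the original data; the
rowSums() line is an alternative to the apply() approach, not part of
the suggestion above.

```r
# Hypothetical toy data: row 3 is entirely NA, row 2 has NA in 'age' only,
# row 4 has NA in 'weight' only.
DF <- data.frame(
  age    = c(23, NA, NA, 41),
  weight = c(70.5, 82.0, NA, NA),
  sex    = c("F", "M", NA, "M")
)

# Second case: drop every record whose 'age' is NA (removes rows 2 and 3).
byAge <- DF[!is.na(DF$age), ]

# First case: drop only records in which every field is NA (removes row 3).
notallna <- function(x) { !all(is.na(x)) }
notAllNA <- DF[apply(DF, 1, notallna), ]

# Alternative for the first case (an assumption of this sketch): count the
# non-NA fields per row and keep rows that have at least one.
notAllNA2 <- DF[rowSums(!is.na(DF)) > 0, ]

print(byAge)
print(notAllNA)
```

Note that the "age" check also removes row 2, which has valid data in
the other fields, so the two criteria only agree if "age" is missing
solely in the false records.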