[R] selection of missing data

2005-11-13 Thread [EMAIL PROTECTED]
Hi, I'm a French medical student.
I have some data that I imported from Excel. The columns of my data frame
are the localisations of metastases. If there is a metastasis, the cell
contains the symbol _. I want to exclude the rows without metastasis, which
represent the NA data.

So I wrote this (mela is the data frame):

# selection of the columns of metastasis localisation
mela1 = ifelse(mela[, c(11:12, 14:21, 23, 24)] == "_", 1, 0)

## selection of the rows with no metastasis localisation
mela4 = subset(mela3, Skin == 0 & s.c == 0 & Mucosa == 0 & Soft.ti == 0 &
               Ln.peri == 0 & Ln.med == 0 & Ln.abdo == 0 & Lung == 0 & Liver == 0 &
               Other.Visc == 0 & Bone == 0 & Marrow == 0 & Brain == 0 & Other == 0)
nrow(mela4)

But I don't know if it is possible to do the same thing as
ifelse(mela3, Skin & s.c == 0, 0, NA) with more than one column, and then to
exclude the NA rows from my data with na.omit.

My last question is: how can I omit only the rows which have NA values in
the metastasis columns c(11:12, 14:21, 23, 24)?

Thank you for your help



Bertrand billemont

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] selection of missing data

2005-11-13 Thread Adaikalavan Ramasamy
I do not quite follow your post but here are some suggestions. 


1) You can use the na.strings argument to simplify things

   df <- read.delim(file="lala.txt", na.strings="-")
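
As a small illustration of suggestion 1, here is a sketch that reads a made-up tab-delimited table from a string instead of a file. The column names, the values and the "no" marker for an absent site are assumptions for the example only; they are not taken from the original data.

```r
# Hypothetical tab-delimited data: "-" marks a metastasis at that site,
# "no" (an assumed marker, not from the original post) marks its absence.
txt <- "id\tSkin\tLung\n1\t-\tno\n2\tno\tno\n3\t-\t-\n"

# na.strings = "-" turns every "-" field into NA while reading.
df <- read.delim(text = txt, na.strings = "-")

# Number of NA entries (former "-" symbols) per row in the two site columns.
na.per.row <- rowSums(is.na(df[, c("Skin", "Lung")]))
```

Note that a column holding only "-" and blank fields would be read as all NA, because blank fields count as missing in logical and numeric columns, so it is worth checking the result with str(df).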


2) You can count the number of metastases per row first, then find
the rows with zero sum.

   met.cols      <- c(11:12, 14:21, 23, 24)  # metastasis columns
   number.of.met <- rowSums( mela[ , met.cols ] == "-" )
   have.no.met   <- which( number.of.met == 0 )
   mela.no.met   <- mela[ have.no.met , ]

If you had coded your "-" as NA when reading the data in, then the second
line needs to be changed to

   number.of.met <- rowSums( is.na( mela[ , met.cols ] ) )

or simply use complete.cases

   met.cols    <- c(11:12, 14:21, 23, 24)  # metastasis columns
   mela.no.met <- mela[ which( complete.cases(mela[ , met.cols]) ) , ]
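
The two routes in suggestion 2 can be checked against each other on a toy data frame. This is only a sketch: the small mela below is a hypothetical stand-in with two site columns, and the columns are selected by name rather than by position. The last line shows the complementary selection, in case the goal is to keep the rows that do have a metastasis.

```r
# Toy stand-in for 'mela' (hypothetical values); "-" marks a metastasis.
mela <- data.frame(
  id   = 1:4,
  Skin = c("-", "",  "",  "-"),
  Lung = c("",  "",  "-", "-"),
  stringsAsFactors = FALSE
)
met.cols <- c("Skin", "Lung")  # names instead of positions for the toy data

# Route 1: count the "-" symbols per row.
number.of.met <- rowSums(mela[, met.cols] == "-")

# Route 2: recode "-" as NA, then use complete.cases().
mela.na <- mela
mela.na[met.cols][mela.na[met.cols] == "-"] <- NA
no.met.rows <- complete.cases(mela.na[, met.cols])

# Both routes flag the same rows as having no metastasis.
stopifnot(identical(unname(number.of.met == 0), no.met.rows))

mela.no.met  <- mela[number.of.met == 0, ]  # rows without any metastasis
mela.any.met <- mela[number.of.met > 0, ]   # rows with at least one metastasis
```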


3) If you name your columns in a systematic fashion, then you can easily
extract and specify those columns. For example if your columns were
named 

   cn <- c( "age", "colon.met", "PSA.level", "prostate.met", "gender",
            "hospitalisation.days", "status", "liver.met", "ethnicity" )

Then you can extract those names ending with .met as

   met.cols <- grep( "\\.met$", cn )
   met.cols
   [1] 2 4 8
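
Combining suggestion 3 with complete.cases gives one way to drop only the rows that have an NA in a metastasis column while keeping rows with NA elsewhere. The data frame below is hypothetical and reuses some of the example column names above; it is a sketch, not the original data.

```r
# Hypothetical data frame whose metastasis columns end in ".met".
dat <- data.frame(
  age          = c(61, 54, 70),
  colon.met    = c(1, NA, 0),
  PSA.level    = c(NA, 4.2, 3.1),
  prostate.met = c(0, 1, 0)
)

# Positions of the columns whose names end in ".met".
met.cols <- grep("\\.met$", names(dat))

# Keep the rows with no NA in the metastasis columns;
# an NA in another column (here PSA.level) does not remove the row.
dat.complete.met <- dat[complete.cases(dat[, met.cols]), ]
```

By contrast, na.omit(dat) would drop every row containing an NA in any column.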


Regards, Adai


