Hello everyone,

I am working on a public health project, and we have built a decision tree for 
categorical variables using the rpart package. Our goal is to develop a model 
(using the ROC tool) to predict the presence/absence of diabetes and to get a 
better understanding of which factors are important in a particular Chilean 
population. We have found some important variables. Now we want to apply this 
model to a large dataset to estimate a possible outcome (the probability of 
getting the disease), but for each person we only have the combination of 
predictor variables.

We have created this code:

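library(rpart)
library(ROCR)  # assumed source of prediction()

# Fit a classification tree on the labelled data
fit1 <- rpart(sickness ~ aetinghabit + gse + age + sex, method = "class", data = data)
# Predicted class probabilities for the people in the large dataset
prediccion <- predict(fit1, bigdatabase, type = "prob")
predictionsyes <- prediccion[, 2]
pred <- prediction(predictionsyes, datos$sickness) # but this is 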
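
In case it helps, here is a minimal, self-contained sketch of the same workflow 
on simulated data. The factor levels, the simulated outcome, the train/test 
split and the two example people are all made up for illustration and are not 
our real data; the ROC step assumes the ROCR package and is skipped if it is 
not installed.

```r
library(rpart)

set.seed(1)
n <- 500

# Simulated stand-in for the labelled study data (all values hypothetical)
datos <- data.frame(
  aetinghabit = factor(sample(c("good", "regular", "bad"), n, replace = TRUE)),
  gse         = factor(sample(c("low", "middle", "high"), n, replace = TRUE)),
  age         = factor(sample(c("<40", "40-60", ">60"), n, replace = TRUE)),
  sex         = factor(sample(c("F", "M"), n, replace = TRUE))
)
p <- plogis(-1 + 1.2 * (datos$aetinghabit == "bad") + 1.0 * (datos$age == ">60"))
datos$sickness <- factor(ifelse(runif(n) < p, "yes", "no"), levels = c("no", "yes"))

# Hold out part of the labelled data for evaluation
test_idx <- sample(n, 150)
train <- datos[-test_idx, ]
test  <- datos[test_idx, ]

fit1 <- rpart(sickness ~ aetinghabit + gse + age + sex,
              method = "class", data = train)

# New people: predictor columns only, no sickness column
bigdatabase <- data.frame(
  aetinghabit = factor(c("bad", "good"), levels = levels(datos$aetinghabit)),
  gse         = factor(c("low", "high"), levels = levels(datos$gse)),
  age         = factor(c(">60", "<40"),  levels = levels(datos$age)),
  sex         = factor(c("M", "F"),      levels = levels(datos$sex))
)

# Predicted probability of sickness == "yes" for each new person
prob_yes <- predict(fit1, newdata = bigdatabase, type = "prob")[, "yes"]
print(prob_yes)

# ROC curve on the held-out rows, where the outcome is known
if (requireNamespace("ROCR", quietly = TRUE)) {
  test_prob <- predict(fit1, newdata = test, type = "prob")[, "yes"]
  pred <- ROCR::prediction(test_prob, test$sickness)
  perf <- ROCR::performance(pred, "tpr", "fpr")
  plot(perf)
}
```

Here the ROC curve is drawn on the held-out rows, where the outcome is known; 
bigdatabase has no sickness column, so only predicted probabilities are 
computed for it.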



My question is: how do I feed each person's conditions into this model to get 
that person's probability of getting the disease? Is it possible to draw a ROC 
curve using only bigdatabase, given that we do not know whether these people 
got the disease or not?

It would be very helpful if someone could shed some light on this. Any web 
resource on how to do it would be much appreciated.

Thanks in advance.
Best Regards,

José Bustos
Escuela de Enfermeria
Pontificia Universidad Católica de Chile
Proyecto FONIS 2010
Celular 95939144






______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
