Dear Group,

I am trying to simulate a dataset of 200 individuals with randomly
assigned Sex (1, 0) and Weight drawn from a sex-specific lognormal
distribution.  I am puzzled by how the rlnorm function behaves when I use
it to draw Weight values from the specified distribution.  Here is the code:
ID <- 1:200
Sex <- sample(c(0, 1), 200, replace = TRUE, prob = c(0.4, 0.6))
fulldata <- data.frame(ID, Sex)
fulldata$Wt <- ifelse(fulldata$Sex == 1,
                      rlnorm(100, meanlog = log(85.1), sdlog = sqrt(0.0329)),
                      rlnorm(100, meanlog = log(73), sdlog = sqrt(0.0442)))

mean(fulldata$Wt[fulldata$Sex == 0])  # check that the mean is close to 73
mean(fulldata$Wt[fulldata$Sex == 1])  # check that the mean is close to 85

I see that the number of simulated values requested from rlnorm affects
the group means calculated afterwards. That is, using rlnorm(100, meanlog =
log(73), sdlog = sqrt(0.0442)) in the ifelse statement above gives a much
better match than using rlnorm(1, meanlog = log(73), sdlog = sqrt(0.0442))
in the same place.
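
To make the comparison concrete, here is a small sketch of the two variants
side by side. The seed and the names wt1, wt100, n_distinct_1 and
n_distinct_100 are my own additions for illustration only. It counts how
many distinct weights each variant produces for Sex == 0:

```r
set.seed(1)  # arbitrary seed, used only so the run is reproducible
Sex <- sample(c(0, 1), 200, replace = TRUE, prob = c(0.4, 0.6))

# Variant A: a single draw in each branch of ifelse
wt1 <- ifelse(Sex == 1,
              rlnorm(1, meanlog = log(85.1), sdlog = sqrt(0.0329)),
              rlnorm(1, meanlog = log(73), sdlog = sqrt(0.0442)))

# Variant B: 100 draws in each branch of ifelse
wt100 <- ifelse(Sex == 1,
                rlnorm(100, meanlog = log(85.1), sdlog = sqrt(0.0329)),
                rlnorm(100, meanlog = log(73), sdlog = sqrt(0.0442)))

# Number of distinct simulated weights among individuals with Sex == 0
n_distinct_1   <- length(unique(wt1[Sex == 0]))
n_distinct_100 <- length(unique(wt100[Sex == 0]))
```

If I read this correctly, variant A gives every individual of a given sex
the same weight, so the group mean is just that one random value, while
variant B reuses its 100 draws across the 200 rows.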

My understanding is that ifelse draws only one value for each row where
the condition is met.  I would appreciate your insights into why
increasing the number of values requested from rlnorm gives a better
match.
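
To check my understanding, here is a minimal sketch of how ifelse appears to
treat its yes and no arguments: both are evaluated once as whole vectors,
recycled to the length of the test, and then picked from by position. The
toy vectors test, yes, no and res are hypothetical names used only for
illustration. The last part is one possible alternative, relying on rlnorm
being vectorized over meanlog and sdlog, that draws a separate value for
every row:

```r
# Toy example: yes and no are shorter than test, so they are recycled
test <- c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE)
yes  <- c(1, 2, 3)     # recycled to 1 2 3 1 2 3
no   <- c(10, 20, 30)  # recycled to 10 20 30 10 20 30

# Element i of the result comes from position i of the recycled yes or no
res <- ifelse(test, yes, no)

# Possible alternative: one independent draw per individual, with the
# lognormal parameters chosen row by row according to Sex
set.seed(1)  # arbitrary seed, used only so the run is reproducible
Sex <- sample(c(0, 1), 200, replace = TRUE, prob = c(0.4, 0.6))
Wt <- rlnorm(length(Sex),
             meanlog = ifelse(Sex == 1, log(85.1), log(73)),
             sdlog   = sqrt(ifelse(Sex == 1, 0.0329, 0.0442)))
```

In the toy example res comes out as 1 20 3 10 2 30, which suggests the
values are taken by position rather than drawn fresh for each row.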

Regards,
Ayyappa


