[R] Question on weighted Kaplan-Meier analysis of case-cohort design
I have a study best described as a retrospective case-cohort design: the cases were all the events in the surveyed time span, and the controls (event-free during the follow-up period) were selected at a 2:1 ratio (2 controls per case). The sampling fraction for the controls was about 0.27, so I used a weight vector of 1 for cases and 1/0.27 for controls in coxph to adjust for the sampling. Using the same weights in a Kaplan-Meier analysis (survfit) gave very inaccurate survival curves (a much lower event rate than expected from the population). Is weighting handled differently between coxph and survfit? How should I conduct a weighted Kaplan-Meier analysis for such a design, given that survfit doesn't accept a weighted Cox model? Any explanations or suggestions are highly appreciated.

xiaojun
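For concreteness, a minimal sketch of the two calls being compared, with simulated data; the variable names (time, status, x, w) and the toy data are my assumptions, not from the actual study. Both coxph and survfit accept a weights argument in the formula interface:

    library(survival)

    ## Toy stand-in for the case-cohort sample; everything here is hypothetical.
    set.seed(1)
    n <- 300
    d <- data.frame(time   = rexp(n),
                    status = rbinom(n, 1, 1/3),   # 1 = event (case), 0 = sampled control
                    x      = rnorm(n))

    ## Weight of 1 for cases, 1/(sampling fraction) for controls, as in the post.
    d$w <- ifelse(d$status == 1, 1, 1/0.27)

    ## Weighted Cox model and weighted Kaplan-Meier estimate.
    fit.cox <- coxph(Surv(time, status) ~ x, data = d, weights = w)
    fit.km  <- survfit(Surv(time, status) ~ 1, data = d, weights = w)
    plot(fit.km)

In both calls the weights are treated as case weights; whether that is the appropriate weighting scheme for a case-cohort sample is exactly the question raised above.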
[R] subselect install problem
Trying to install subselect v0.8 on Red Hat 7.3 with R 1.8.1 fails (output below). Any help is greatly appreciated.

Xiao-Jun

* Installing *source* package 'subselect' ...
** libs
f2c anneal.f
anneal.c
anneal:
Error on line 263: Declaration error for fica: adjustable dimension on non-argument
Error on line 263: Declaration error for valp: adjustable dimension on non-argument
Error on line 263: Declaration error for auxw: adjustable dimension on non-argument
Error on line 263: wr_ardecls: nonconstant array size
Error on line 263: wr_ardecls: nonconstant array size
Error on line 263: wr_ardecls: nonconstant array size
make: *** [anneal.o] Error 1
ERROR: compilation failed for package 'subselect'
[R] fitting gaussian mixtures
Hi R-helpers, I'm trying to model a univariate variable as a bimodal (two-component) normal mixture. I need to estimate the parameters of each Gaussian component (mean and SD) and their mixing weights. What's the best way to do this in R? Thanks, Xiao-Jun
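One option (my suggestion, not something stated in the post) is a mixture-model package such as mclust (Mclust) or mixtools (normalmixEM). The underlying EM algorithm is simple enough to sketch by hand for the two-component case; the data and starting values below are made up for illustration:

    ## Hand-rolled EM sketch for a two-component normal mixture (simulated data).
    set.seed(1)
    x <- c(rnorm(200, 0, 1), rnorm(100, 4, 0.5))

    p  <- 0.5                                   # mixing weight of component 1
    mu <- c(mean(x) - sd(x), mean(x) + sd(x))   # crude starting means
    s  <- rep(sd(x), 2)                         # starting SDs

    for (iter in 1:200) {
      ## E-step: posterior probability that each point belongs to component 1
      d1 <- p       * dnorm(x, mu[1], s[1])
      d2 <- (1 - p) * dnorm(x, mu[2], s[2])
      z  <- d1 / (d1 + d2)

      ## M-step: update mixing weight, means and SDs
      p     <- mean(z)
      mu[1] <- sum(z * x) / sum(z)
      mu[2] <- sum((1 - z) * x) / sum(1 - z)
      s[1]  <- sqrt(sum(z * (x - mu[1])^2) / sum(z))
      s[2]  <- sqrt(sum((1 - z) * (x - mu[2])^2) / sum(1 - z))
    }

    c(weight1 = p, mean1 = mu[1], sd1 = s[1], mean2 = mu[2], sd2 = s[2])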
[R] knn using custom distance metric
Hi, There are two packages providing knn classification: class and knnTree. However, it seems both use Euclidean distance only. How can I use a custom distance function with either package? Thanks, Xiao-Jun
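Since class::knn has no distance argument, one workaround (my own sketch, not part of the original post; the function name knn.custom and the Manhattan distance are arbitrary) is to roll a small k-nearest-neighbour classifier that takes any distance function:

    ## knn with an arbitrary distance function; majority vote over the k nearest rows
    knn.custom <- function(train, test, cl, k = 3, distfun) {
      cl <- factor(cl)
      pred <- apply(test, 1, function(z) {
        d  <- apply(train, 1, distfun, z)      # distance from z to each training row
        nn <- order(d)[seq_len(k)]             # indices of the k nearest neighbours
        names(which.max(table(cl[nn])))        # majority vote among their classes
      })
      factor(pred, levels = levels(cl))
    }

    ## example: Manhattan (L1) distance on the built-in iris data
    mydist <- function(a, b) sum(abs(a - b))
    set.seed(1)
    idx  <- sample(nrow(iris), 100)
    pred <- knn.custom(iris[idx, 1:4], iris[-idx, 1:4], iris$Species[idx],
                       k = 5, distfun = mydist)
    table(pred, iris$Species[-idx])

This is O(n^2) in distance evaluations, so it will be much slower than class::knn, but it accepts any distfun(a, b) returning a single number.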
RE: [R] Getting rid of loops?
Simon and Peter, Thanks for your help. Peter's function speeds it up 25x vs. my naive code!

XiaoJun

-----Original Message-----
From: Peter Dalgaard
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; Xiao-Jun Ma
Sent: 02-12-03 15.57
Subject: Re: [R] Getting rid of loops?

[EMAIL PROTECTED] writes:

I think this will do what you want, though there may be ways of speeding it up further.

theta.dist2 <- function(x)
    as.dist(acos(crossprod(t(x)) / sqrt(crossprod(t(rowSums(x * x))))) / pi * 180)

Or,

theta.dist <- function(x)
    as.dist(acos(cov2cor(crossprod(t(x)))) / pi * 180)

Now, if only there were a way to tell cor() not to center the variables, we'd have

    as.dist(acos(cor(t(x), center = FALSE)) / pi * 180)

Unfortunately there's no such argument.

The original loop-based version was:

theta.dist <- function(x) {
  res <- matrix(NA, nrow(x), nrow(x))
  for (i in 1:nrow(x)) {
    for (j in 1:nrow(x)) {
      if (i > j) res[i, j] <- res[j, i]
      else {
        v1 <- x[i, ]
        v2 <- x[j, ]
        good <- !is.na(v1) & !is.na(v2)
        v1 <- v1[good]
        v2 <- v2[good]
        theta <- acos(v1 %*% v2 / sqrt(v1 %*% v1 * v2 %*% v2)) / pi * 180
        res[i, j] <- theta
      }
    }
  }
  as.dist(res)
}

--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
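A quick sanity check (my own addition, with made-up data) that the vectorized one-liner reproduces the loop version when there are no missing values; theta.dist here refers to the loop version defined last above:

    set.seed(1)
    x <- matrix(rnorm(20), 5, 4)
    all.equal(theta.dist(x), theta.dist2(x))   # expect TRUE when x has no NAs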
[R] coloring dendrogram in heatmap?
Using the heatmap function in the mva package, it seems hard to use different colors for the edges leading to different groups of objects, as is commonly done in microarray heatmaps. Any suggestions? Thanks. max
RE: [R] coloring dendrogram in heatmap?
No, I meant coloring the edges of the dendrogram on the left or top of the image plot.

-----Original Message-----
From: kjetil brinchmann halvorsen
To: '[EMAIL PROTECTED]'; 'Martin Maechler'
Sent: 9/27/03 1:24 PM
Subject: Re: [R] coloring dendrogram in heatmap?

On 27 Sep 2003 at 11:56, Xiao-Jun Ma wrote:

What about trying RColorBrewer, as mentioned in the docs of heatmap? I had good results with that!

Kjetil Halvorsen
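For the edge-coloring question itself, one possible approach (my own sketch, not from this thread; the grouping, colors, and whether heatmap preserves the attributes through its reordering step are all assumptions) is to set the "edgePar" attribute on leaf nodes with dendrapply and pass the resulting dendrogram to heatmap via Rowv:

    ## colour the terminal edges of the row dendrogram by cluster (toy data)
    set.seed(1)
    x <- matrix(rnorm(100), 20, 5)
    rownames(x) <- paste("g", 1:20, sep = "")
    hc  <- hclust(dist(x))
    grp <- cutree(hc, k = 2)                   # named by rownames(x)
    dd  <- as.dendrogram(hc)

    colour_leaf_edge <- function(node) {
      if (is.leaf(node)) {
        lab <- attr(node, "label")
        attr(node, "edgePar") <- list(col = c("red", "blue")[grp[lab]])
      }
      node
    }
    dd <- dendrapply(dd, colour_leaf_edge)

    plot(dd)                                   # check the colouring on its own
    heatmap(x, Rowv = dd)                      # then hand the dendrogram to heatmap

This only colours the edges ending at the leaves; colouring whole subtrees would need a recursive variant that checks whether all leaves under a node fall in the same group.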
[R] speeding up 1000s of coxph regressions?
I have a gene expression matrix of n (genes) x p (cases), where n = 8000 and p = 100. I want to fit a univariate coxph model for each gene, i.e., fit 8000 models. I do something like this:

res <- apply(data, 1, coxph.func)

which takes about 4 minutes, not bad. But I also need to do a large number of permutations of the data (permuting the columns), for example 2000, which would take about 5 days. Is there a way to speed this up? Any help appreciated. Xiao-Jun
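For concreteness, a minimal sketch of the kind of loop described, with made-up object names and simulated data (it does not solve the speed problem, but shows where the 2000-fold cost comes in). One small saving is to build the Surv() response once, outside coxph.func, instead of recreating it for every gene:

    library(survival)

    ## toy sizes; the real problem is 8000 genes x 100 cases
    n.genes <- 200; n.cases <- 100
    expr     <- matrix(rnorm(n.genes * n.cases), n.genes, n.cases)
    surv.obj <- Surv(rexp(n.cases), rbinom(n.cases, 1, 0.7))   # built once

    coxph.func <- function(g) {
      fit <- coxph(surv.obj ~ g)
      summary(fit)$coefficients[1, c("coef", "Pr(>|z|)")]
    }
    res <- apply(expr, 1, coxph.func)          # one pass over all genes

    ## one permutation: shuffle the cases (columns) and refit every gene;
    ## this whole call is what has to be repeated ~2000 times
    perm.once <- function() apply(expr[, sample(n.cases)], 1, coxph.func)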