Hi Christian.
Thanks for this. That's great! Can you send me a copy of your PNG file to my email? I will add this to: http://groups.google.com/group/aroma-affymetrix/web/redundancy-tests Cheers, Mark > > Dear Mark, > > I have now created the following function, which people can use to > compute firma scores: > > # - - - - - - - - - - - - - - - - - - - > firma <- function(m, method="rlm") { > ## function to compute FIRMA scores (adapted from aroma.affymetrix) > ## m: dataframe containing normalized probe intensities (linear > scale) > ## with 1.column containing the transcript_cluster_id > ## and 2. column containing the corresponding probeset_ids > ## method: fitting model, one of "rlm" or "mdp" > > ## convert expression levels to log2 > y <- as.matrix(log2(m[,3:ncol(m)])); > > ## dimensions > K <- nrow(y); # number of probes > I <- ncol(y); # number of arrays > > ## fit log-additive model > if (method == "rlm") { > fit <- .Call("R_rlm_rma_default_model", y, 0, 1.345, > PACKAGE="preprocessCore"); > } else if (method == "mdp") { > mp <- medpolish(y, trace.iter=FALSE); > fit <- list(Estimates = c(mp$overall + mp$col, mp$row), > StdErrors = rep(0, length(c(mp$row, mp$col)))); > } else { > stop("method must be <rlm, mdp>"); > }#if > > ## extract parameters > est <- fit$Estimates; > se <- fit$StdErrors; > > ## chip effects > beta <- est[1:I]; > > ## probe affinities > if (K == 1) { > ## if only one probe must have affinity=1 since sum constraint > alpha <- 0; > } else { > ## affinities sum to zero (on log scale) > alpha <- est[(I+1):length(est)]; > alpha[length(alpha)] <- -sum(alpha[1:(length(alpha)-1)]); > }#if > > ## estimates on the intensity scale > theta <- 2^beta; > phi <- 2^alpha; > > ## calculate residuals > phi <- matrix(phi, nrow=K, ncol=I, byrow=FALSE); > theta <- matrix(theta, nrow=K, ncol=I, byrow=TRUE); > yhat <- phi *theta; > eps <- (2^y)/yhat; # rma uses y/yhat > > ## estimate of standard error > u.mad <- mad(unlist(log2(eps)), center=0); > > ## get probeset_ids > id <- unique(m[,2]); > > ## firma scores > fs <- sapply(id, > function(x){apply(log2(eps[which(m[,2]==x),,drop=F])/u.mad, 2, > median)}); > fs <- t(fs); > rownames(fs) <- id; > > return(fs); > }#firma > # - - - - - - - - - - - - - - - - - - - > > Now I can apply this function to the normalized intensities of gene > UNR: > > > library(preprocessCore) > > score.mdp <- firma(unr, "mdp") > > score.rml <- firma(unr, "rlm") > > Then I can plot the firma scores for robust fitting ("rlm") compared > to > median-polish ("mdp"): > > > png(file="firmaplots6.png", width=540, height=800) > > oldpar <- par(pty="m", mfcol=c(3, 2), mar=c(5,5,4,2)) > > for (i in 1:6) { > > plot(score.rml[,i], type="l", ylim=c(-3,5), > main=colnames(score.rml)[i], xlab="Probeset", ylab="Score") > > abline(h=0, lty=3) > > lines(score.mdp[,i], lty=2) > > } > > par(oldpar) > > dev.off() > > The attached plots compare "rlm" (solid lines) and "mdp" (dashed > lines) > and should reflect the firma scores for heart and muscle shown in > Figure > 3B of Purdom et al. > As you can see the difference between"rlm" and "mdp" is pretty small. > > Best regards > Christian > > P.S.: It seems that I cannot attach the png-file. > > > On Mar 18, 8:28 am, cstratowa <[email protected] > ingelheim.com> wrote: >> Dear Mark, >> >> Probably, there is not much difference, maybe I will check it. >> I will let you know but it will take some time. >> >> Best regards >> Christian >> >> On Mar 17, 11:46 am, Mark Robinson <[email protected]> wrote: >> >> > Hi Christian. >> >> > > However, in this respect I have also the following question: >> > > How does using "median polish" compare to using >> > > "R_rlm_rma_default_model"? >> > > Are the final scores still of some use if you use medpol? >> >> > Short answer is I haven't investigated this too thoroughly. >> >> > But, my guess is that it wouldn't be "too" different. That prediction >> >> > is based on the fact that the chip effects are in the same ballpark, >> > as you can see from the Aroma_vs_Affy (Aroma=R_rlm_rma, Affy=medpol) >> > plot in the following thread: >> >> >http://groups.google.com/group/aroma-affymetrix/browse_thread/thread/... >> >> > But, I'd be interested to hear more details if you do look into it >> more. >> >> > Cheers, >> > Mark >> >> > > Best regards >> > > Christian >> >> > > On Mar 16, 9:13 am, Mark Robinson <[email protected]> wrote: >> > >> Hi Christian. >> >> > >> From what I can tell looking at your code (rather quickly, i must >> > >> admit), there will be 2 differences between aroma.affymetrix and >> what >> > >> you have: >> >> > >> 1. We use the 'preprocessCore' codebase for the robust fitting of >> the >> > >> linear model (... but maybe you are just using median polish as an >> > >> illustration). For example, you might try: >> >> > >> library(preprocessCore) >> > >> f <- .Call("R_rlm_rma_default_model", log2(yTr), 0, >> > >> 1.345,PACKAGE="preprocessCore") >> > >> [... and piece together the alpha, beta, etc ...] >> >> > >> 2. The "estimate of standard error" is calculated genewise, over >> > >> residuals from all probes/samples (i.e. u.mad should be a scalar >> > >> not a >> > >> vector). >> >> > >> Hope that helps. >> > >> Mark >> >> > >> On 16/03/2009, at 6:32 PM, cstratowa wrote: >> >> > >>> Dear all, >> >> > >>> After reading the FIRMA paper I would like to understand the >> > >>> implementation, but this is not easy since the source code is hard >> >> > >>> to >> > >>> read. Nevertheless, I tried and would like to know if this is >> > >>> correct. >> >> > >>> According to the page on exon array analysis you do the following: >> >> > >>> I, fit a summary of the entire transcript >> > >>>> plmTr <- ExonRmaPlm(csN, mergeGroups=TRUE) >> > >>>> fit(plmTr, verbose=verbose) >> >> > >>> II, fit the FIRMA model for each exon >> > >>>> firma <- FirmaModel(plmTr) >> > >>>> fit(firma, verbose=verbose) >> >> > >>> However, I would like to understand the underlying source code. >> >> > >>> For this example let us assume that we have quantile-normalized >> > >>> intensities yTr for a transcript containing two exons: >> > >>>> yTr >> > >>> HeartA HeartB HeartC MuscleA MuscleB MuscleC >> > >>> 1 5.74954 18.0296 2.50436 15.5857 26.1744 31.0075 >> > >>> 2 9.59819 23.0093 22.01120 70.1742 32.8408 102.0080 >> > >>> 3 114.50800 87.1742 70.34080 312.3410 266.1740 601.3410 >> > >>> 4 66.34080 52.0075 67.34080 184.1740 266.1740 147.0080 >> > >>> 5 210.17400 142.0080 173.34100 514.5080 659.1740 509.6740 >> > >>> 6 104.00800 84.3408 70.34080 333.5080 324.1740 231.0080 >> > >>> 7 194.00800 124.5080 234.00800 443.6740 767.5080 716.8410 >> > >>> 8 319.34100 282.6740 283.50800 656.0080 807.6740 954.6740 >> >> > >>> Here rows 1:4 code for exon 1 and rows 5:8 code for exon 2. >> >> > >>> I, fit a summary of the entire transcript >> > >>> To simplify issues I will fit the data using median polish: >> > >>> # 1. fit median polish >> > >>>> mp <- medpolish(log2(yTr)) >> >> > >>> # 2. data set specific estimates (probe affinities) >> > >>>> beta <- mp$overall+mp$col >> > >>>> thetaTr <- 2^beta >> >> > >>> # 3. array-specific estimates >> > >>>> alpha <- mp$row >> > >>>> alpha[length(alpha)] <- -sum(alpha[1:(length(alpha)-1)]) >> > >>>> phiTr <- 2^alpha >> >> > >>> II, fit FIRMA model for each exon >> > >>> # 1. calculate residuals >> > >>>> phi <- matrix(phiTr, nrow=nrow(yTr), ncol=ncol(yTr)) >> > >>>> theta <- matrix(thetaTr, nrow=nrow(yTr), ncol=ncol(yTr), >> > >>> byrow=TRUE) >> > >>>> yhat <- phi *theta >> > >>>> eps <- yTr/yhat # rma uses y/yhat >> >> > >>> # 2. estimate of standard error >> > >>>> u.mad <- apply(log2(eps), 2, mad, center=0) >> >> > >>> # 3. compute final score statisitc >> > >>> # for 1. exon >> > >>>> y1 <- log2(eps[1:4,]) >> > >>>> F1 <- apply(y1/u.mad, 2, median) >> > >>>> F1 >> > >>> HeartA HeartB HeartC MuscleA MuscleB >> > >>> MuscleC >> > >>> -0.89938777 -0.03792624 -0.69409936 0.11536565 -0.61385296 >> > >>> 1.08709568 >> >> > >>> # for 2. exon >> > >>>> y2 <- log2(eps[5:8,]) >> > >>>> F2 <- apply(y2/u.mad, 2, median) >> > >>>> F2 >> > >>> HeartA HeartB HeartC MuscleA MuscleB >> > >>> MuscleC >> > >>> -0.02899616 -1.64645153 -0.70048533 -0.39996057 0.02666064 >> > >>> -1.46657055 >> >> > >>> Now my question is: >> > >>> Is this calculation of the final score statistic F1 for exon 1 and >> >> > >>> F2 >> > >>> for exon 2 correct? >> > >>> Did I miss something? >> >> > >>> Best regards >> > >>> Christian >> >> > >> ------------------------------ >> > >> Mark Robinson >> > >> Epigenetics Laboratory, Garvan >> > >> Bioinformatics Division, WEHI >> > >> e: [email protected] >> > >> e: [email protected] >> > >> p: +61 (0)3 9345 2628 >> > >> f: +61 (0)3 9347 0852 >> > >> ------------------------------ >> >> > ------------------------------ >> > Mark Robinson >> > Epigenetics Laboratory, Garvan >> > Bioinformatics Division, WEHI >> > e: [email protected] >> > e: [email protected] >> > p: +61 (0)3 9345 2628 >> > f: +61 (0)3 9347 0852 >> > ------------------------------ > > > --~--~---------~--~----~------------~-------~--~----~ When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/aroma-affymetrix?hl=en -~----------~----~----~----~------~----~------~--~---
