Re: [aroma.affymetrix] extract LRR and BAF from ACNE results

hongen xu Mon, 01 Jun 2015 02:26:57 -0700

Dear Henrik,

many thanks for your detailed explanation.




在 2015年5月29日星期五 UTC+2下午9:20:34，Henrik Bengtsson写道：
>
> So, there are few things here: 
>
> 1. In the Aroma Framework, we always try to store estimates/signals on 
> the "original" scale, e.g. even if, say, chip-effect estimates are 
> original on the log scale, we unlog those before saving.  That is a 
> policy with the rationale that all data can always be represented on 
> the "original" scale but not always on, say, the log scale.  An 
> example of the latter is when a probe signal ends up being slightly 
> negative after calibration and normalization.  If that would have been 
> stored on the log-scale we would have lost the information, i.e. it 
> would be stored as NaN.  Same problem with a signal that is exactly 
> zero (-Inf on the log scale).  Why bother?  For instance, say you end 
> up with replicated probe signals for a single SNP where they all 
> fluctuate around zero.  After averaging, the average signal may end up 
> non-negative. If we stored on the log scale the averaging would have 
> only been on the signals that were positive (because we lost the 
> non-positive one).  So, that's that rational. 
>
> 2. For methods estimating "CNs" it depends on their 
> approach/method/design: 
>
> (a) For instance, the CRMA v1 and CRMA v2 methods (ours; happens to be 
> implemented in the Aroma Framework, but should never be referred to as 
> the "Aroma CN method", "aroma.affymetrix CN method" or similar), we 
> end up with "CN signals" that have not yet been standardized relative 
> to a reference.  In other words, those signals are "all over the 
> place", because each of them are scaled with an unknown scale factors. 
>   It is only when you have a reference when you can create "CN 
> ratios", e.g. C = tumor / matched normal or C = sample / pooled 
> average, etc.   Comment: For CRMA v2, this is not that surprising, 
> because that method was specifically designed to work for each array 
> independently.  That means you get "CN signals" out from a single 
> array, but then you need to compare them to a reference. 
>
> (b) The ACNE method (however), calculates CN ratios for you.  This is 
> by design of the model, simply because how the NMF factorization model 
> works, cf. Eqns (10)-(14) in 
> http://bioinformatics.oxfordjournals.org/content/26/15/1827.full. 
> Comment: The ACNE method is a multi-array method requiring many arrays 
> to work and borrows information across arrays (e.g. estimate SNP 
> effects).  Since this information is already done, it can equally well 
> return CN ratios for you.  Having said this, there is nothing 
> preventing you from taking the ratios relative to another reference 
> (ratio), e.g. effectively C = (T / R) / (R / R*) = T / R. 
>
> So, in the ACNE vignette 
> [http://www.aroma-project.org/vignettes/ACNE/] that you're following 
> there is, near the end: 
>
> cf <- ces[[1]] 
> data <- extractTotalAndFreqB(cf, units=units) 
> CT <- data[,"total"] 
>
> Since you used ACNE to estimate the CNs, here 'CT' is indeed a "CN 
> ratios".  The reason why we still refer to it as a "total" signal, is 
> because that is the most generic name we could come up with for the 
> underlying file format. 
>
> Hope this clarifies 
>
>
> /Henrik 
>
> On Fri, May 29, 2015 at 6:52 AM, hongen xu <hongen....@gmail.com 
> <javascript:>> wrote: 
> > Dear Aroma.affymetrix teem, 
> > 
> > I followed the vignette "Allele-specific copy numbers using non-negative 
> > matrix factorization". I want to extract LRR and BAF to feed into 
> another 
> > software "ASCAT" to correct aneuploidy and normal contamination. 
> > 
> > From the vignette, I can use extractTotalAndFreqB to extract total CN 
> and 
> > BAF. My question is how can I get LRR from Total CN. I went through this 
> > thread  "creat binary data files containing BAF data", but get confused. 
> > 
> > quote from "creat binary data files containing BAF data" 
> > "The take-home message is that 
> > the definition of "log2 ratios" has become de facto standard in our 
> > field, where as "CN" is a bit ambiguous, and can mean "CN ratio" 
> > (C=theta/thetaR, or C=2*theta/thetaR) as well as "CN intensity" 
> > (theta).  By adding tags we try to make this less ambiguous." 
> > 
> > How total CN was defined in aroma.affymetrix? 
> > 
> > 
> > Best regards, 
> > Hongen 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > my code 
> > 
> > ############ 
> > library("aroma.affymetrix") 
> > library("ACNE") 
> > 
> > 
> > log <- verbose <- Arguments$getVerbose(-8, timestamp=TRUE) 
> > # Don't display too many decimals. 
> > options(digits=4) 
> > 
> > 
> > #Verifying annotation data files 
> > cdf <- AffymetrixCdfFile$byChipType("CytoScanHD_Array"); 
> > print(cdf) 
> > gi <- getGenomeInformation(cdf) 
> > print(gi) 
> > si <- getSnpInformation(cdf) 
> > print(si) 
> > acs <- AromaCellSequenceFile$byChipType(getChipType(cdf, 
> fullname=FALSE)) 
> > print(acs) 
> > 
> > ##raw data sets 
> > csR<- AffymetrixCelSet$byName("smida", cdf=cdf); 
> > 
> > ###step 1 Calibration for crosstalk between allele probe pairs 
> > acc <- AllelicCrosstalkCalibration(csR, model="CRMAv2") 
> > 
> > print(acc) 
> > 
> > csC <- process(acc, verbose=verbose) 
> > print(csC) 
> > 
> > 
> > ##Step 2 - Normalization for nucleotide-position probe sequence effects 
> > bpn <- BasePositionNormalization(csC, target="zero") 
> > csN <- process(bpn, verbose=verbose) 
> > cs <- csN 
> > 
> > 
> > 
> > #step 3 
> > #Probe summarization using non-negative-matrix factorization (NMF) 
> > 
> > plm <- NmfSnpPlm(csN, mergeStrands=TRUE) 
> > print(plm) 
> > 
> > if (length(findUnitsTodo(plm)) > 0) { 
> >    # Fit CN probes quickly (~5-10s/array + some overhead) 
> >   units <- fitCnProbes(plm, verbose=verbose) 
> >   str(units) 
> >   # int [1:945826] 935590 935591 935592 935593 935594 935595 ... 
> > 
> >   # Fit remaining units, i.e. SNPs (~5-10min/array) 
> >   units <- fit(plm, verbose=verbose) 
> >   str(units) 
> > } 
> > 
> > ces <- getChipEffectSet(plm) 
> > print(ces) 
> > 
> > 
> > 
> > my session information: 
> > ###################### 
> >> sessionInfo() 
> > R version 3.1.3 (2015-03-09) 
> > Platform: x86_64-redhat-linux-gnu (64-bit) 
> > Running under: Fedora 20 (Heisenbug) 
> > 
> > locale: 
> >  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C 
> >  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8 
> >  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8 
> >  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C 
> >  [9] LC_ADDRESS=C               LC_TELEPHONE=C 
> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C 
> > 
> > attached base packages: 
> > [1] stats     graphics  grDevices utils     datasets  methods   base 
> > 
> > other attached packages: 
> >  [1] ACNE_0.8.0              aroma.light_2.2.1       
> aroma.affymetrix_2.13.2 
> >  [4] aroma.core_2.13.1       R.devices_2.13.0        R.filesets_2.7.2 
> >  [7] R.utils_2.0.2           R.oo_1.19.0             affxparser_1.38.0 
> > [10] R.methodsS3_1.7.0 
> > 
> > loaded via a namespace (and not attached): 
> >  [1] aroma.apd_0.6.0    base64enc_0.1-2    digest_0.6.8       
> DNAcopy_1.40.0 
> >  [5] MASS_7.3-40        matrixStats_0.14.0 PSCBS_0.44.0       
> R.cache_0.11.0 
> >  [9] R.huge_0.9.0       R.rsp_0.20.0       tools_3.1.3 
> > 
> > 
> > 
> > -- 
> > -- 
> > When reporting problems on aroma.affymetrix, make sure 1) to run the 
> latest 
> > version of the package, 2) to report the output of sessionInfo() and 
> > traceback(), and 3) to post a complete code example. 
> > 
> > 
> > You received this message because you are subscribed to the Google 
> Groups 
> > "aroma.affymetrix" group with website http://www.aroma-project.org/. 
> > To post to this group, send email to aroma-af...@googlegroups.com 
> <javascript:> 
> > To unsubscribe and other options, go to 
> http://www.aroma-project.org/forum/ 
> > 
> > --- 
> > You received this message because you are subscribed to the Google 
> Groups 
> > "aroma.affymetrix" group. 
> > To unsubscribe from this group and stop receiving emails from it, send 
> an 
> > email to aroma-affymetr...@googlegroups.com <javascript:>. 
> > For more options, visit https://groups.google.com/d/optout. 
>

-- 
-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

--- 
You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to aroma-affymetrix+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: [aroma.affymetrix] extract LRR and BAF from ACNE results

Reply via email to