Thank you Henrik! OK, I followed your instructions, and the same problem (the error in callAB()) remains, or rather, it now occurs already in sample one. Also,
1. My Nuse plot after calmate calibration look different from before (no longer symmetric around 1, see below). What difference does it make to plm that I used RmaCnPlm instead of AvgCnPlm before qam <- QualityAssessmentModel(plm); plotNuse(qam); ? 2. The SNP specific CN estimates get a lot less variable, but the segmentation gets more variable (more and shorter segments). I suppose the latter is a consequence of the less variable CNs, so we get a higher sensitivity in some sence. However, I do not know whether it was better with longer and fewer segments for calling AB. So, at the moment I have no idea what to come up with for tomorrows deadline, I suppose it will be some quick fix and I have to go back to this later. All the best, Ingrid On Saturday, 23 March 2013 05:03:50 UTC+11, Henrik Bengtsson wrote: > > Hi Ingrid. > > On Wed, Mar 20, 2013 at 4:45 PM, Ingrid Lönnstedt > <ingrid.l...@gmail.com <javascript:>> wrote: > > > > Hi! > > > > I have segmented 56 SNP 6 tumour samples with segmentByNonPairedPSCBS(), > > and like to call allelic balance segments with callAB() as required, I > > understand, before I can call neutral TCNs with callNTCN(). > > First, I just want to make it clear (to you and others), that calling > segments to have copy-neutral TCNs ("NTCN") or not is experimental and > under development and was not part of the Paired PSCBS (2011) paper > (which only focused on calling AB and LOH). I've updated the package > to clarified this in the vignette. There I also give a little bit > more explanation on what parameters are estimated in order to call > NTCN, 'kappa' and 'deltaCN', where 'kappa' is tricky to estimate > especially in tumors where nothing much goes on. > > Second, likewise, Non-paired PSCBS should also be considered > experimental (although much less so than calling NTCN). It's only the > Paired PSCBS method that is well tested and published. > > Third, some heads up (I know some of you are waiting for this): We're > improving the NTCN calling as part of our in-house projects, and some > potential improvements have already been discovered and may eventually > become the default. We'll make it clear in the vignette when we have > a "stable" solution. (*) Yes, it's 2013 and calling PSCNs in tumors > is still not straightforward - it's not easy to find an omnibus method > that works in all cases - many tried before us. > > Fourth, you can install the latest devel version of PSCBS as > (recommended): > > source("http://aroma-project.org/hbLite.R"); > hbInstall("PSCBS"); > > > > > For the first > > tumour sample, both callAB() and callNTCN() seem to work fine, but then > on > > the second sample I get errors with callAB(), see below. > > See comment after the error. > > > The plotTracks() > > image looks pretty much the same between sample one and two, and I have > no > > reason to believe that sample two has a major problem in the > segmentation. 4 > > of the 56 samples get very noisy segmentations, with lots of short > regions > > scattered (has anybody seen that before?) > > Yes, over-segmentation is common in CN analysis. I'd argue that you > actually want a bit of over-segmentation and post-cleanup and/or > calling method to handle that. However, I assume you mean extreme > over-segmentation. There is no simple solution for that. The ideal > solution is to do better normalization of the locus-level data > ("whatever method that would be") before you pass it on to the > segmentation method. > > > but that is not the case with > > sample one or two. Any suggestions to how I might assess this error? > > > > Best regards, > > Ingrid > > > > dsNList <- exportTotalAndFracB(cesN, verbose=verbose); > > dataN = extractPSCNArray(dsNList$total) > > > > yT = dataN[,'total',samp] > > CT = yT/median(yT) > > Using median(yT) is not correct and will result in unnecessary large > amount of noise in the TCN signals. > > Dropping index 'T'. Consider a particular sample 'i' (e.g. your tumor > of interest) and SNP 'j'. When calculating the relative TCN for this > sample and tumor, C_{i,j}, you need to rescale the observed SNP signal > y_{i,j} relative to the (unknown) mean SNP signal for this SNP, > mu_{j}, i.e. C_{i,j} = 2 * y_{i,j}/mu_{j}. (*) Multiply by "2" for > diploid genomes (GWS6 = human). > > When you have matched normal ('N'), you often get very good > signal-to-noise ratios if you use mu*_{j} = y_{N,j}. Thus, for the > tumor-normal pair (T,N) you would calculate the relative TCNs for the > tumor as C_{T,j} = 2 * y_{T,j}/y_{N,j}. > > When you don't have matched normal, you need to use another estimate > for mu_{j}. It's common to use the robust average of all the samples > in your data set, i.e. mu*_{j} = median_{i} (y_{i,j}). > > FYI, above you are using mu*_{j} = median_{j} (y_{T,j}) - note > averaging across SNPs in one sample and not across samples in one SNP > - that is a big difference! > > Thus, in your case you need to use the latter. To calculate this > robust average in the aroma framework, do: > > yR <- getAverageFile(dsNList$total); > > and then do: > > CT = 2 * yT/yR > > > > betaT = dataN[,'fracB',samp] > > ugp = getAromaUgpFile(dsNList$total) > > chromosome = ugp[,1,drop = T] > > x = ugp[, 2, drop = T] > > df = data.frame(chromosome = chromosome, x=x, CT=CT, betaT = betaT) > > df = dropSegmentationOutliers(df) > > #This gives warnings: > > #24: In DNAcopy::CNA(genomdat = yKK, chrom = chromosomeKK, ... > : > > #array has repeated maploc positions > > gaps = findLargeGaps(df, minLength = 1e+06) > > knownSegments = gapsToSegments(gaps) > > > > fit = segmentByNonPairedPSCBS(df, knownSegments = knownSegments, > > avgDH="median", > > tauA = .33, seed = > > 48879, verbose = -10) > > #This gives warnings that there are multipe maploc positions > > fit = callAB(fit, verbose = -10) > > > > Calling segments of allelic balance from one-sided DH bootstrap > confidence > > intervals... > > delta (offset adjusting for bias in DH): 0.211106073856354 > > alpha (CI quantile; significance level): 0.05 > > List of 7 > > $ nbrOfDHs : int 70 > > $ dhNbrOfLoci : int 70 > > $ mu : num 0.183 > > $ dhMean : num 0.203 > > $ dMu : num -0.02 > > $ abs(dMu) : num 0.02 > > $ range(x[units]): num [1:2] 990417 2792801 > > Error in bootstrapTCNandDHByRegion.PairedPSCBS(fit, statsFcn = statsFcn, > > : > > INTERNAL ERROR: Incorrect DH mean! > > Calling segments of allelic balance from one-sided DH bootstrap > > confidence intervals...done > > This is a internal sanity check throwing an error. As a start, I > recommend to retry with (i) a proper CT calculate, and (ii) after > updated PSCBS. > > Hope this helps > > Henrik > > > > > > > > > > > head(getSegments(fit)) > > chromosome tcnId dhId tcnStart tcnEnd tcnNbrOfLoci tcnMean > > tcnNbrOfSNPs tcnNbrOfHets dhNbrOfLoci dhStart dhEnd > > 1 1 1 1 61736.0 811403.5 73 1.4735545 > > 15 2 2 61736.0 811403.5 > > 2 1 2 1 811403.5 2821394.0 527 0.9201369 > > 183 70 70 811403.5 2821394.0 > > 3 1 3 1 2821394.0 2827143.5 2 5.0273015 > > 2 0 0 2821394.0 2827143.5 > > 4 1 4 1 2827143.5 11694138.0 5531 0.9780667 > > 2940 983 983 2827143.5 11694138.0 > > 5 1 5 1 11694138.0 11709225.0 5 3.3949294 > > 5 3 3 11694138.0 11709225.0 > > 6 1 6 1 11709225.0 16923050.0 2918 1.0072282 > > 1485 550 550 11709225.0 16923050.0 > > dhMean c1Mean c2Mean > > 1 0.1695309 0.6118708 0.8616838 > > 2 0.2033105 0.3665317 0.5536052 > > 3 NA NA NA > > 4 0.2033807 0.3895734 0.5884933 > > 5 0.1465555 1.4486918 1.9462376 > > 6 0.2036166 0.4010699 0.6061583 > > > > > > > traceback() > > 7: stop("INTERNAL ERROR: Incorrect DH mean!") > > 6: bootstrapTCNandDHByRegion.PairedPSCBS(fit, statsFcn = statsFcn, > > ..., verbose = less(verbose, 50)) > > 5: bootstrapTCNandDHByRegion(fit, statsFcn = statsFcn, ..., verbose = > > less(verbose, > > 50)) > > 4: callAllelicBalanceByDH.PairedPSCBS(fit, ...) > > 3: callAllelicBalanceByDH(fit, ...) > > 2: callAB.PairedPSCBS(fit, verbose = -10) > > 1: callAB(fit, verbose = -10) > > > > > > > > > > > sessionInfo() > > > > R version 2.15.2 Patched (2013-02-09 r61879) > > Platform: x86_64-unknown-linux-gnu (64-bit) > > > > locale: > > [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C > > LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 > > [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=C > > LC_NAME=C > > [9] LC_ADDRESS=C LC_TELEPHONE=C > > LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C > > > > attached base packages: > > [1] splines stats graphics grDevices utils datasets methods > > base > > > > other attached packages: > > [1] Hmisc_3.10-1 survival_2.37-4 DNAcopy_1.32.0 > > PSCBS_0.30.0 aroma.affymetrix_2.8.0 > > [6] affxparser_1.30.2 aroma.apd_0.2.3 R.huge_0.4.1 > > aroma.light_1.28.0 aroma.core_2.8.0 > > [11] matrixStats_0.6.2 R.rsp_0.8.2 R.devices_2.1.3 > > R.cache_0.6.5 R.filesets_2.0.0 > > [16] crlmm_1.16.9 preprocessCore_1.20.0 oligoClasses_1.20.0 > > BiocGenerics_0.4.0 R.utils_1.19.5 > > [21] R.oo_1.11.7 R.methodsS3_1.4.2 > > > > loaded via a namespace (and not attached): > > [1] affyio_1.26.0 annotate_1.36.0 AnnotationDbi_1.20.5 > > Biobase_2.18.0 BiocInstaller_1.8.3 Biostrings_2.26.3 > > [7] bit_1.1-9 cluster_1.14.3 codetools_0.2-8 > > DBI_0.2-5 digest_0.6.3 ellipse_0.3-7 > > [13] ff_2.2-10 foreach_1.4.0 genefilter_1.40.0 > > GenomicRanges_1.10.7 grid_2.15.2 IRanges_1.16.6 > > [19] iterators_1.0.6 lattice_0.20-13 Matrix_1.0-11 > > mvtnorm_0.9-9994 parallel_2.15.2 RcppEigen_0.3.1.2.1 > > [25] RSQLite_0.11.2 stats4_2.15.2 tools_2.15.2 > > XML_3.95-0.1 xtable_1.7-1 zlibbioc_1.4.0 > > > > > > > > > > > > > > > -- > > -- > > When reporting problems on aroma.affymetrix, make sure 1) to run the > > latest version of the package, 2) to report the output of sessionInfo() > and > > traceback(), and 3) to post a complete code example. > > > > > > You received this message because you are subscribed to the Google > Groups > > "aroma.affymetrix" group with website http://www.aroma-project.org/. > > To post to this group, send email to > > aroma-af...@googlegroups.com<javascript:> > > To unsubscribe and other options, go to > > http://www.aroma-project.org/forum/ > > > > --- > > You received this message because you are subscribed to the Google > Groups > > "aroma.affymetrix" group. > > To unsubscribe from this group and stop receiving emails from it, send > an > > email to aroma-affymetr...@googlegroups.com <javascript:>. > > For more options, visit https://groups.google.com/groups/opt_out. > > > > > -- -- When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group with website http://www.aroma-project.org/. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe and other options, go to http://www.aroma-project.org/forum/ --- You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group. To unsubscribe from this group and stop receiving emails from it, send an email to aroma-affymetrix+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.