Re: [aroma.affymetrix] An error message from running PSCBS
On Sat, Jan 14, 2012 at 10:02 PM, Hoon Kim wise...@gmail.com wrote: Dear Henrik, Thank you very much for your time to reply to my questions, and your suggestions worked very well. By following your advice, I successfully evaluated LOHs for all samples. I wonder if I can ask you two more questions, one related to your previous email and the other related to the output of PSCBS. 1. A question related to your previous email: --A part of your previous email # Estimate the LOH *threshold* from autosomal chromosomes alone DeltaLOH - estimateDeltaLOH(extractChromosomes(fit, 1:22)); # Call LOH fitC - callLOH(fit, delta=DeltaLOH); - In this case, do I also have to estimate the Alleleic Balance *threshold* (deltaAB) from autosomal chromosomes alone in order to do callAB? You could do that, but since the AB caller relies on weaker assumptions it is more likely to work well anyway. The advantage of estimating the DeltaAB and DeltaLOH threshold from only the autosomal chromosomes is that (ceteris paribus) the estimates will be the same regardless of the germline being XX or XY; include X and Y in those estimates will give a tiny (and negligible) bias between XXs and XYs. 2. Please, find the attached file that contains the PSCBS output for two patients. In Patient 1, most lohCalls are NA rather than False, while only a few of NAs are shown in Patient 2. I would greatly appreciate it if you let me know why Patient 1 has many NAs rather than False. Unfortunately, I couldn't find a reference where I can find answer myself, but I might have missed. See argument 'xorCalls' in callLOH(): If @TRUE, a region already called AB, will not be called LOH. The basic idea is that if a segment is already called AB, it cannot be LOH. Thus, if AB = TRUE then LOH = FALSE, i.e. (AB,LOH)=(TRUE,FALSE). However, we still run the LOH caller and gets the calls. Now, to distinguish the cases where the AB and LOH agree from when they disagree, we report either (AB,LOH)=(TRUE,FALSE) or (TRUE,NA). The former is reported when they agree and the latter when they disagree. You can always override this yourself by the logic: LOH[AB] - FALSE. In the next release help(callLOH, package=PSCBS) will clarify this better. Cheers, Henrik Thank you very much. Best, Hoon On Fri, Jan 13, 2012 at 4:31 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: Hi, I could reproduce this using the PairedPSCBS object that you sent me; library(PSCBS); # Load Paired PSCBS results with AB calls already done fit - loadObject(HoonKim_20120110.Rbin); fitC - callLOH(fit); ## Error: muC1atNonAB muC1atAB is not TRUE # The CN data (figure attached) looks good toPNG(HoonKim_20120110, tags=PSCBS, width=840, aspectRatio=0.8, { plotTracks(fit); }) What causing the problem is basically that nothing is going on in the tumor, that is, it behaves as the germline. No change points were detected on any chromomosome (Chr 1-22, 23 (X), 24 (Y), 25 (M)), resulting in one segment per chromosome. Each such segment/chromosome was then correctly called to be in allelic balance (AB) by callAB() that you did earlier. The exception is ChrY, which is non-AB. The total CNs and the BAFs for Chr24 are probably garbage, because this sample is an XX person, meaning we only observe noise on ChrY. One obvious solution is to exclude ChrY, either already when running the segmentation, or immediately afterward, e.g. df - subset(df, chromosome %in% 1:23); fit - segmentByPairedPSCBS(df, verbose=verbose); fit - callAB(fit); fit - callLOH(fit); Alternatively, if you wish to keep ChrY, say for consistency with other samples, you can do as follows: # Estimate the LOH *threshold* from autosomal chromosomes alone DeltaLOH - estimateDeltaLOH(extractChromosomes(fit, 1:22)); # Call LOH fitC - callLOH(fit, delta=DeltaLOH); In both cases you will get a warning that DeltaLOH was set to -Inf, which is alright. See below. DETAILS: When calling LOH, we compare (a bootstrap quantile estimate of) the minor CN (C1) of each segment toward a threshold DeltaLOH. See Eqn (7) in Olshen et al. (2011) [http://aroma-project.org/publications/] for details. The main problem in calling LOH is to choose a good DeltaLOH. As explained in the paper, this threshold is sample specific and requires rather strong assumptions and ad hoc estimators. In the Section S1.2 of the Supplementary Materials, we propose on approach to estimate DeltaLOH from data. The idea is to choose DeltaLOH as the midpoint between where we believe the true CN=1 and true CN=0 is. We estimate the true CN=1 (mu1) from all segment that are in AB and with TCN near two. That estimate is not that debatable and works for data sets. However, estimating where the true CN=0 (mu0) is, requires that we have some segments that have a zero minor CN. The method for
Re: [aroma.affymetrix] An error message from running PSCBS
On Sun, Jan 15, 2012 at 11:20 AM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: On Sat, Jan 14, 2012 at 10:02 PM, Hoon Kim wise...@gmail.com wrote: Dear Henrik, Thank you very much for your time to reply to my questions, and your suggestions worked very well. By following your advice, I successfully evaluated LOHs for all samples. I wonder if I can ask you two more questions, one related to your previous email and the other related to the output of PSCBS. 1. A question related to your previous email: --A part of your previous email # Estimate the LOH *threshold* from autosomal chromosomes alone DeltaLOH - estimateDeltaLOH(extractChromosomes(fit, 1:22)); # Call LOH fitC - callLOH(fit, delta=DeltaLOH); - In this case, do I also have to estimate the Alleleic Balance *threshold* (deltaAB) from autosomal chromosomes alone in order to do callAB? You could do that, but since the AB caller relies on weaker assumptions it is more likely to work well anyway. The advantage of estimating the DeltaAB and DeltaLOH threshold from only the autosomal chromosomes is that (ceteris paribus) the estimates will be the same regardless of the germline being XX or XY; include X and Y in those estimates will give a tiny (and negligible) bias between XXs and XYs. 2. Please, find the attached file that contains the PSCBS output for two patients. In Patient 1, most lohCalls are NA rather than False, while only a few of NAs are shown in Patient 2. I would greatly appreciate it if you let me know why Patient 1 has many NAs rather than False. Unfortunately, I couldn't find a reference where I can find answer myself, but I might have missed. See argument 'xorCalls' in callLOH(): If @TRUE, a region already called AB, will not be called LOH. The basic idea is that if a segment is already called AB, it cannot be LOH. Thus, if AB = TRUE then LOH = FALSE, i.e. (AB,LOH)=(TRUE,FALSE). However, we still run the LOH caller and gets the calls. Now, to distinguish the cases where the AB and LOH agree from when they disagree, we report either (AB,LOH)=(TRUE,FALSE) or (TRUE,NA). The former is reported when they agree and the latter when they disagree. You can always override this yourself by the logic: LOH[AB] - FALSE. Forgot to add that (AB,LOH)=(TRUE,NA) call also be reported when the LOH caller itself could not decide on either TRUE or FALSE, and hence return NA, e.g. when there is too little evidence/data. For your Patient #1, it is likely that the estimate of the DeltaLOH threshold is not correct/sensible, which would explain your (TRUE,FALSE) and (TRUE,NA) calls, as well as the (FALSE,FALSE) and (FALSE,TRUE) calls. /Henrik In the next release help(callLOH, package=PSCBS) will clarify this better. Cheers, Henrik Thank you very much. Best, Hoon On Fri, Jan 13, 2012 at 4:31 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: Hi, I could reproduce this using the PairedPSCBS object that you sent me; library(PSCBS); # Load Paired PSCBS results with AB calls already done fit - loadObject(HoonKim_20120110.Rbin); fitC - callLOH(fit); ## Error: muC1atNonAB muC1atAB is not TRUE # The CN data (figure attached) looks good toPNG(HoonKim_20120110, tags=PSCBS, width=840, aspectRatio=0.8, { plotTracks(fit); }) What causing the problem is basically that nothing is going on in the tumor, that is, it behaves as the germline. No change points were detected on any chromomosome (Chr 1-22, 23 (X), 24 (Y), 25 (M)), resulting in one segment per chromosome. Each such segment/chromosome was then correctly called to be in allelic balance (AB) by callAB() that you did earlier. The exception is ChrY, which is non-AB. The total CNs and the BAFs for Chr24 are probably garbage, because this sample is an XX person, meaning we only observe noise on ChrY. One obvious solution is to exclude ChrY, either already when running the segmentation, or immediately afterward, e.g. df - subset(df, chromosome %in% 1:23); fit - segmentByPairedPSCBS(df, verbose=verbose); fit - callAB(fit); fit - callLOH(fit); Alternatively, if you wish to keep ChrY, say for consistency with other samples, you can do as follows: # Estimate the LOH *threshold* from autosomal chromosomes alone DeltaLOH - estimateDeltaLOH(extractChromosomes(fit, 1:22)); # Call LOH fitC - callLOH(fit, delta=DeltaLOH); In both cases you will get a warning that DeltaLOH was set to -Inf, which is alright. See below. DETAILS: When calling LOH, we compare (a bootstrap quantile estimate of) the minor CN (C1) of each segment toward a threshold DeltaLOH. See Eqn (7) in Olshen et al. (2011) [http://aroma-project.org/publications/] for details. The main problem in calling LOH is to choose a good DeltaLOH. As explained in the paper, this threshold is sample specific and requires rather strong
Re: [aroma.affymetrix] An error message from running PSCBS
Dear Henrik, Thank you very much for your clear explanations. Because the estimate of the DeltaLOH threshold seems not correct for several samples (as you mentioned in the previous email), I am thinking of only using the *dhMean* column rather than the *lohCall* for interpreting my data. Again, thank you very much, and have a good holiday. Best, Hoon Kim On Sun, Jan 15, 2012 at 1:20 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: On Sat, Jan 14, 2012 at 10:02 PM, Hoon Kim wise...@gmail.com wrote: Dear Henrik, Thank you very much for your time to reply to my questions, and your suggestions worked very well. By following your advice, I successfully evaluated LOHs for all samples. I wonder if I can ask you two more questions, one related to your previous email and the other related to the output of PSCBS. 1. A question related to your previous email: --A part of your previous email # Estimate the LOH *threshold* from autosomal chromosomes alone DeltaLOH - estimateDeltaLOH(extractChromosomes(fit, 1:22)); # Call LOH fitC - callLOH(fit, delta=DeltaLOH); - In this case, do I also have to estimate the Alleleic Balance *threshold* (deltaAB) from autosomal chromosomes alone in order to do callAB? You could do that, but since the AB caller relies on weaker assumptions it is more likely to work well anyway. The advantage of estimating the DeltaAB and DeltaLOH threshold from only the autosomal chromosomes is that (ceteris paribus) the estimates will be the same regardless of the germline being XX or XY; include X and Y in those estimates will give a tiny (and negligible) bias between XXs and XYs. 2. Please, find the attached file that contains the PSCBS output for two patients. In Patient 1, most lohCalls are NA rather than False, while only a few of NAs are shown in Patient 2. I would greatly appreciate it if you let me know why Patient 1 has many NAs rather than False. Unfortunately, I couldn't find a reference where I can find answer myself, but I might have missed. See argument 'xorCalls' in callLOH(): If @TRUE, a region already called AB, will not be called LOH. The basic idea is that if a segment is already called AB, it cannot be LOH. Thus, if AB = TRUE then LOH = FALSE, i.e. (AB,LOH)=(TRUE,FALSE). However, we still run the LOH caller and gets the calls. Now, to distinguish the cases where the AB and LOH agree from when they disagree, we report either (AB,LOH)=(TRUE,FALSE) or (TRUE,NA). The former is reported when they agree and the latter when they disagree. You can always override this yourself by the logic: LOH[AB] - FALSE. In the next release help(callLOH, package=PSCBS) will clarify this better. Cheers, Henrik Thank you very much. Best, Hoon On Fri, Jan 13, 2012 at 4:31 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: Hi, I could reproduce this using the PairedPSCBS object that you sent me; library(PSCBS); # Load Paired PSCBS results with AB calls already done fit - loadObject(HoonKim_20120110.Rbin); fitC - callLOH(fit); ## Error: muC1atNonAB muC1atAB is not TRUE # The CN data (figure attached) looks good toPNG(HoonKim_20120110, tags=PSCBS, width=840, aspectRatio=0.8, { plotTracks(fit); }) What causing the problem is basically that nothing is going on in the tumor, that is, it behaves as the germline. No change points were detected on any chromomosome (Chr 1-22, 23 (X), 24 (Y), 25 (M)), resulting in one segment per chromosome. Each such segment/chromosome was then correctly called to be in allelic balance (AB) by callAB() that you did earlier. The exception is ChrY, which is non-AB. The total CNs and the BAFs for Chr24 are probably garbage, because this sample is an XX person, meaning we only observe noise on ChrY. One obvious solution is to exclude ChrY, either already when running the segmentation, or immediately afterward, e.g. df - subset(df, chromosome %in% 1:23); fit - segmentByPairedPSCBS(df, verbose=verbose); fit - callAB(fit); fit - callLOH(fit); Alternatively, if you wish to keep ChrY, say for consistency with other samples, you can do as follows: # Estimate the LOH *threshold* from autosomal chromosomes alone DeltaLOH - estimateDeltaLOH(extractChromosomes(fit, 1:22)); # Call LOH fitC - callLOH(fit, delta=DeltaLOH); In both cases you will get a warning that DeltaLOH was set to -Inf, which is alright. See below. DETAILS: When calling LOH, we compare (a bootstrap quantile estimate of) the minor CN (C1) of each segment toward a threshold DeltaLOH. See Eqn (7) in Olshen et al. (2011) [http://aroma-project.org/publications/] for details. The main problem in calling LOH is to choose a good DeltaLOH. As explained in the paper, this threshold is sample specific
Re: [aroma.affymetrix] An error message from running PSCBS
On Sun, Jan 15, 2012 at 11:38 AM, Hoon Kim wise...@gmail.com wrote: Dear Henrik, Thank you very much for your clear explanations. Because the estimate of the DeltaLOH threshold seems not correct for several samples (as you mentioned in the previous email), I am thinking of only using the *dhMean* column rather than the *lohCall* for interpreting my data. It is probably better using the minor CN (C1) estimates for calling LOH, and instead of doing that manually, use a preset DeltaLOH threshold in callLOH(). That is pretty much what you are proposing while keeping the advantage of the bootstrap estimate of the C1 mean. See Eqn (6) in Olshen et al. (2011) [http://aroma-project.org/publications/]. If you're still want to use DH instead of C1, see the code of callLOH(); there are flavors of the caller that uses DH instead. Note, calling LOH is very complicated, both from a theoretical as well as a practical point of view. The complication of LOH is unfortunately rarely appreciated and often overlooked in most of the literature. /Henrik Again, thank you very much, and have a good holiday. Best, Hoon Kim On Sun, Jan 15, 2012 at 1:20 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: On Sat, Jan 14, 2012 at 10:02 PM, Hoon Kim wise...@gmail.com wrote: Dear Henrik, Thank you very much for your time to reply to my questions, and your suggestions worked very well. By following your advice, I successfully evaluated LOHs for all samples. I wonder if I can ask you two more questions, one related to your previous email and the other related to the output of PSCBS. 1. A question related to your previous email: --A part of your previous email # Estimate the LOH *threshold* from autosomal chromosomes alone DeltaLOH - estimateDeltaLOH(extractChromosomes(fit, 1:22)); # Call LOH fitC - callLOH(fit, delta=DeltaLOH); - In this case, do I also have to estimate the Alleleic Balance *threshold* (deltaAB) from autosomal chromosomes alone in order to do callAB? You could do that, but since the AB caller relies on weaker assumptions it is more likely to work well anyway. The advantage of estimating the DeltaAB and DeltaLOH threshold from only the autosomal chromosomes is that (ceteris paribus) the estimates will be the same regardless of the germline being XX or XY; include X and Y in those estimates will give a tiny (and negligible) bias between XXs and XYs. 2. Please, find the attached file that contains the PSCBS output for two patients. In Patient 1, most lohCalls are NA rather than False, while only a few of NAs are shown in Patient 2. I would greatly appreciate it if you let me know why Patient 1 has many NAs rather than False. Unfortunately, I couldn't find a reference where I can find answer myself, but I might have missed. See argument 'xorCalls' in callLOH(): If @TRUE, a region already called AB, will not be called LOH. The basic idea is that if a segment is already called AB, it cannot be LOH. Thus, if AB = TRUE then LOH = FALSE, i.e. (AB,LOH)=(TRUE,FALSE). However, we still run the LOH caller and gets the calls. Now, to distinguish the cases where the AB and LOH agree from when they disagree, we report either (AB,LOH)=(TRUE,FALSE) or (TRUE,NA). The former is reported when they agree and the latter when they disagree. You can always override this yourself by the logic: LOH[AB] - FALSE. In the next release help(callLOH, package=PSCBS) will clarify this better. Cheers, Henrik Thank you very much. Best, Hoon On Fri, Jan 13, 2012 at 4:31 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: Hi, I could reproduce this using the PairedPSCBS object that you sent me; library(PSCBS); # Load Paired PSCBS results with AB calls already done fit - loadObject(HoonKim_20120110.Rbin); fitC - callLOH(fit); ## Error: muC1atNonAB muC1atAB is not TRUE # The CN data (figure attached) looks good toPNG(HoonKim_20120110, tags=PSCBS, width=840, aspectRatio=0.8, { plotTracks(fit); }) What causing the problem is basically that nothing is going on in the tumor, that is, it behaves as the germline. No change points were detected on any chromomosome (Chr 1-22, 23 (X), 24 (Y), 25 (M)), resulting in one segment per chromosome. Each such segment/chromosome was then correctly called to be in allelic balance (AB) by callAB() that you did earlier. The exception is ChrY, which is non-AB. The total CNs and the BAFs for Chr24 are probably garbage, because this sample is an XX person, meaning we only observe noise on ChrY. One obvious solution is to exclude ChrY, either already when running the segmentation, or immediately afterward, e.g. df - subset(df, chromosome %in% 1:23); fit - segmentByPairedPSCBS(df, verbose=verbose); fit - callAB(fit); fit -
[aroma.affymetrix] An error message from running PSCBS
Dear Henrik, I have been doing paired PSCBS analysis on a Genome Wide SNP array 6.0 data set in order to estimate LOH for 20 pairs of samples. The code on the aroma web site successfully estimated LOH for most pairs, while it fails to estimate LOH for several bad pairs. An error message I got from a bad pair is : fit - callLOH(fit, verbose=verbose); Error: muC1atNonAB muC1atAB is not TRUE The code, which is not exactly the same as my actual one, that I use is as follows: === library(aroma.affymetrix); verbose - Arguments$getVerbose(-10, timestamp=TRUE); library(PSCBS); res - doASCRMAv2(csR, verbose=verbose); data- extractPSCNArray(res$total); # Total CNs for the tumor relative to the matched normal CT - 2 * (data[,total,T] / data[,total,N]); # Allele B fractions for the tumor betaT - data[,fracB,T]; # Allele B fractions for the normal betaN - data[,fracB,N]; # Get (chromosome, position) annotation data ugp - getAromaUgpFile(res$total); chromosome - ugp[,1,drop=TRUE]; x - ugp[,2,drop=TRUE]; # Setup data structure for Paired PSCBS df- data.frame(chromosome=chromosome, x=x, CT=CT, betaT=betaT, betaN=betaN); # Run PSCBS fit - segmentByPairedPSCBS(df, verbose=verbose); print(...segmentByPairedPSCBS) fit - callAB(fit, verbose=verbose); print(...callAB) fit - callLOH(fit, verbose=verbose); print(...callLOH) === I wonder if you provide me with any suggestion regarding this error message. Thank you, Best, Hoon sessionInfo() is as follows: === sessionInfo() R version 2.13.0 (2011-04-13) Platform: x86_64-unknown-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] splines stats graphics grDevices utils datasets methods [8] base other attached packages: [1] Hmisc_3.8-3survival_2.36-5DNAcopy_1.26.0 [4] PSCBS_0.17.1 aroma.affymetrix_2.3.0 affxparser_1.24.0 [7] aroma.apd_0.2.0R.huge_0.3.0 aroma.core_2.3.2 [10] aroma.light_1.22.0 matrixStats_0.4.3 R.rsp_0.7.0 [13] R.cache_0.5.2 R.filesets_1.1.3 digest_0.5.1 [16] R.utils_1.9.6 R.oo_1.8.3 R.methodsS3_1.2.1 loaded via a namespace (and not attached): [1] cluster_1.13.3 grid_2.13.0 lattice_0.19-23 === -- When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups aroma.affymetrix group with website http://www.aroma-project.org/. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe and other options, go to http://www.aroma-project.org/forum/