Yep, I think you're right, for this sample there are many points away from 
the y=x line. 

<https://lh4.googleusercontent.com/-rLT39dQUYe0/Uq9aLgEKsPI/AAAAAAAAKWY/Uv1cJZGoVTo/s1600/2013-12-16_KB170TvsKB170B_beta_TvsN.png>
Thankfully this isn't the case for all the samples though. I suppose other 
than finding the right match algorithmically (as this is public data), I 
could use a pooled reference instead of the matched normal. 

Two more questions that I haven't been able to find in the documentation..
1- Is there a way to remove (or ideally prevent) very small segments? I'm 
finding several small (2-10 probes) segments with very high copy number 
(>10). I'm doubtful that they are all real.
2- I am finding very few deletions (<5 per genome, vs dozens of 
amplifications) in my samples. I'm calling gains/dels based on the standard 
deviation of the median probe log-ratio. Is this skewed distribution a 
common phenomena and should my threshold for deletions be more lenient than 
my threshold for gains?


Thanks!

Emilie



On Friday, December 13, 2013 4:55:26 PM UTC-5, Henrik Bengtsson wrote:
>
> On Fri, Dec 13, 2013 at 11:15 AM, Emilie <emilie....@gmail.com<javascript:>
> > wrote:
>
>> Thanks again Henrick. I do see 3 bands, but not sure they are necessarily 
>> clean/distinct. 
>>
>>
>> <https://lh5.googleusercontent.com/-yPngBXg2loA/Uqtcqi2z4YI/AAAAAAAAKWI/mcUOS0DWiyQ/s1600/2013-12-13_KB170B_BAF.png>
>>
>
> These normal BAFs look ok to me.
>  
>
>> <https://lh5.googleusercontent.com/-yPngBXg2loA/Uqtcqi2z4YI/AAAAAAAAKWI/mcUOS0DWiyQ/s1600/2013-12-13_KB170B_BAF.png>
>> A similar but slightly noisier pattern is observed in the tumour sample. 
>> I also down-sampled the data 50x to be able to see the patterns (vs a black 
>> blob). Would you consider this as noise/a bad run?
>>
>
> Down-sampling is alright when plotting whole-genome data.  
>
> If the tumor BAFs ('betaT') are not much noisier than the normal BAFs 
> ('betaN'), I suspect a mismatched pair.  Next, look at tumor vs normal 
> BAFs, e.g.
>
> plot(betaN, betaT, xlim=c(0,1), ylim=c(0,1))
>
> What do you get?  You should see most data points scattered along the 
> diagonal line, similar to the ones in Figure 4 of the online CalMaTe 
> vignette [ http://aroma-project.org/vignettes/CalMaTe ].  If you see data 
> points all over the place, particularly in upper-left and the lower-right 
> corners, your tumor normal pair is not from the same patient.
>
> /Henrik
>
>
>> Emilie
>>
>>
>>
>> On Thursday, December 12, 2013 6:01:48 PM UTC-5, Henrik Bengtsson wrote:
>>
>>> The tumor DH panel makes me believe that either your tumor or your 
>>> normal chip data is bad, or  alternatively that the tumor and normal are 
>>> not matched.
>>>
>>> Check the allele B fraction of your normal. It should show three 
>>> distinct bands.  Do the same for the tumor. It should also show distinct 
>>> bands with varying  of bands depending on aberrations.  If both look clean, 
>>> then it's likely they're not matched.  If one is very noisy, then that one 
>>> is simply a bad run/sample.
>>>
>>> Henrik
>>> On Dec 12, 2013 12:42 PM, "Emilie" <emilie....@gmail.com> wrote:
>>>
>>>> Thank you both very much! I was indeed referring to smooth.cna, sorry 
>>>> about that confusion.
>>>>
>>>> I've switched over to PSCBS and used the dropSegmentationOutliers- it 
>>>> seems to be running well. I've noticed that some of my samples have very 
>>>> fragmented profiles (see attached). Does this suggest poor quality data, 
>>>> or 
>>>> maybe an error in my normalization/plotting? Not all samples are like 
>>>> this, 
>>>> but it almost seems like the order of the of the probes is scrambled?
>>>>
>>>>
>>>> Emilie
>>>>
>>>>
>>>> On Thursday, December 5, 2013 1:08:46 PM UTC-5, Henrik Bengtsson wrote:
>>>>>
>>>>> Pierre beat me to this one.  Comments below... 
>>>>>
>>>>> On Thu, Dec 5, 2013 at 9:20 AM, Pierre Neuvial 
>>>>> <pierre....@genopole.cnrs.fr> wrote: 
>>>>> > Hi Emilie, 
>>>>> > 
>>>>> > OK, so you are referring to the  “smooth.CNA" function in the 
>>>>> DNAcopy 
>>>>> > package, cf 
>>>>> > http://www.bioconductor.org/packages/2.13/bioc/vignettes/DNA
>>>>> copy/inst/doc/DNAcopy.pdf 
>>>>> > 
>>>>> > What this function is doing is detecting outliers (based on how far 
>>>>> their 
>>>>> > signal value is from their neighbors) and shrink their signal values 
>>>>> toward 
>>>>> > those of their neighbors. 
>>>>> > 
>>>>> > This is indeed appropriate and recommended.  I thought that by 
>>>>> "smoothing" 
>>>>> > you meant performing some kind of local averaging of the original 
>>>>> signal 
>>>>> > (e.g. using a mobile median or by binning): this I don't recommend. 
>>>>>  Sorry 
>>>>> > for the confusion. 
>>>>> > 
>>>>> > 
>>>>> > To drop outliers, one possibility is to use the 
>>>>> "dropSegmentationOutliers" 
>>>>> > function from the PSCBS package.  See the vignettes at 
>>>>> > http://cran.fhcrc.org/web/packages/PSCBS/index.html 
>>>>> > 
>>>>> > Another comment: since you are following the vignette for paired CNA 
>>>>> > analysis, I am guessing that you are working with tumor/normal 
>>>>> pairs.  If 
>>>>> > so, then you should use PSCBS rather than CBS for segmentation. 
>>>>>  PSCBS is an 
>>>>> > extension of CBS to segment not only total copy numbers but also 
>>>>> allelic 
>>>>> > ratios. See the PSCBS vignette in the above URL. 
>>>>>
>>>>> To balance this a little bit, I would say there may exist outliers in 
>>>>> the total copy number (TCN) signals that are so sever that they bias 
>>>>> the estimators/test statistic of CBS (which assumes Gaussian signals). 
>>>>>  If one believes there are such outliers and worries that they are so 
>>>>> extreme that they would affect the segmentation severely, one could 
>>>>> either (i) drop or (ii) shrink ("smooth") them.  In the vignettes of 
>>>>> the PSCBS package, I've last night [PSCBS (>= 0.39.8)] 
>>>>> corrected/clarified Section 'Dropping TCN outliers' to say the 
>>>>> following: 
>>>>>
>>>>> "There may be some outliers among the TCNs.  In 
>>>>> CBS~\citep{OlshenA_etal_2004,VenkatramanOlshen_2007}, the authors 
>>>>> propose a method for identifying outliers and then to shrink such 
>>>>> values toward their neighbors ("smooth") before performing 
>>>>> segmentation.  At the time CBS was developed it made sense to not just 
>>>>> to drop outliers because the resolution was low and every datapoint 
>>>>> was valuable.  With modern technologies the resolution is much higher 
>>>>> and we can afford dropping such outliers, which can be done by: 
>>>>>
>>>>> > data <- dropSegmentationOutliers(data) 
>>>>>
>>>>> Dropping TCN outliers is optional." 
>>>>>
>>>>> Hope this clarifies. 
>>>>>
>>>>> Back to the original question: It is not possible to drop (or smooth) 
>>>>> outliers using the CbsModel() pipeline [I'll add that to the todo 
>>>>> list].  The easiest is to turn use the PSCBS package, where you can do 
>>>>> plain old single-track CBS segmentation, paired PSCBS segmentation and 
>>>>> also non-paired PSCBS segmentation.  As Pierre says, if you have tumor 
>>>>> SNP data, you should look into doing parent-specific CN analysis, 
>>>>> which you can do either via paired or non-paired PSCBS depending on 
>>>>> whether you have match normals or not. 
>>>>>
>>>>> To take your allele-specific CRMAv2 and bring it into a format 
>>>>> recognized by the PSCBS package, see 
>>>>> http://aroma-project.org/vignettes/PairedPSCBS-lowlevel 
>>>>>
>>>>> /Henrik 
>>>>>
>>>>> > 
>>>>> > Best, 
>>>>> > 
>>>>> > Pierre 
>>>>> > 
>>>>> > 
>>>>> > On Wed, Dec 4, 2013 at 5:29 PM, Emilie <emilie....@gmail.com> 
>>>>> wrote: 
>>>>> >> 
>>>>> >> Hi Pierre, 
>>>>> >> 
>>>>> >> Thanks for your answer. I may be wrong but I thought smoothing 
>>>>> prior to 
>>>>> >> segmentation was somewhat common. It is shown in the vignettes for 
>>>>> DNACopy 
>>>>> >> and seems to be fairly common in the literature (this approach was 
>>>>> used in 
>>>>> >> the Metabric paper for example, 
>>>>> >> http://www.ncbi.nlm.nih.gov/pubmed/22522925). 
>>>>> >> 
>>>>> >> I'd be interested in hearing more of your thoughts against this. Do 
>>>>> you 
>>>>> >> have an idea of how much resolution is lost by smoothing? 
>>>>> >> 
>>>>> >> Emilie 
>>>>> >> 
>>>>> >> 
>>>>> >> 
>>>>> >> On Tuesday, December 3, 2013 5:26:38 PM UTC-5, Pierre Neuvial 
>>>>> wrote: 
>>>>> >>> 
>>>>> >>> Hi Emilie, 
>>>>> >>> 
>>>>> >>> It's certainly possible to do this within the Aroma framework 
>>>>> (e.g. using 
>>>>> >>> the function "binnedSmoothing").  It's probably not as 
>>>>> straightforward as 
>>>>> >>> running the segmentation directly, though, because this is not a 
>>>>> typical use 
>>>>> >>> case. 
>>>>> >>> 
>>>>> >>> In fact, I'm not sure why you want to perform smoothing before 
>>>>> >>> segmentation ?  Smoothing is definitely not required before 
>>>>> segmentation, 
>>>>> >>> and I would actually discourage to go this path because it will 
>>>>> end up in a 
>>>>> >>> loss of resolution along the genome at the smoothing step. 
>>>>> >>> 
>>>>> >>> Best, 
>>>>> >>> 
>>>>> >>> Pierre 
>>>>> >>> 
>>>>> >>> 
>>>>> >>> On Tue, Dec 3, 2013 at 8:53 PM, Emilie <emilie....@gmail.com> 
>>>>> wrote: 
>>>>> >>>> 
>>>>> >>>> Hi there, 
>>>>> >>>> 
>>>>> >>>> I'm new to processing Affy SNP6 chips and so am mainly 
>>>>> experimenting 
>>>>> >>>> with different methods to date. I ran CRMAv2 and followed steps 
>>>>> 1-4 from the 
>>>>> >>>> vignette (http://aroma-project.org/vignettes/CRMAv2). For step 
>>>>> 5, I want to 
>>>>> >>>> do a paired analysis. 
>>>>> >>>> 
>>>>> >>>> Previously I've used DNAcopy to perform CBS for other array 
>>>>> types, and 
>>>>> >>>> would like to follow a similar procedure, which includes 
>>>>> smoothing prior to 
>>>>> >>>> segmentation. Is this possible using the aroma.affymetrix 
>>>>> package? So far 
>>>>> >>>> I've followed the vignette for paired CNA analysis 
>>>>> >>>> (http://aroma-project.org/vignettes/pairedTotalCopyNumberAnalysis) 
>>>>> but 
>>>>> >>>> haven't seen any options for smoothing. 
>>>>> >>>> 
>>>>> >>>> thank you very much, 
>>>>> >>>> 
>>>>> >>>> emilie 
>>>>> >>>> 
>>>>> >>>> -- 
>>>>> >>>> -- 
>>>>> >>>> When reporting problems on aroma.affymetrix, make sure 1) to run 
>>>>> the 
>>>>> >>>> latest version of the package, 2) to report the output of 
>>>>> sessionInfo() and 
>>>>> >>>> traceback(), and 3) to post a complete code example. 
>>>>> >>>> 
>>>>> >>>> 
>>>>> >>>> You received this message because you are subscribed to the 
>>>>> Google 
>>>>> >>>> Groups "aroma.affymetrix" group with website 
>>>>> http://www.aroma-project.org/. 
>>>>> >>>> To post to this group, send email to aroma-af...@googlegroups.com 
>>>>> >>>> 
>>>>> >>>> To unsubscribe and other options, go to 
>>>>> >>>> http://www.aroma-project.org/forum/ 
>>>>> >>>> 
>>>>> >>>> --- 
>>>>> >>>> You received this message because you are subscribed to the 
>>>>> Google 
>>>>> >>>> Groups "aroma.affymetrix" group. 
>>>>> >>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>> send 
>>>>> >>>> an email to aroma-affymetr...@googlegroups.com. 
>>>>> >>>> 
>>>>> >>>> For more options, visit https://groups.google.com/groups/opt_out. 
>>>>>
>>>>> >>> 
>>>>> >>> 
>>>>> >> -- 
>>>>> >> -- 
>>>>> >> When reporting problems on aroma.affymetrix, make sure 1) to run 
>>>>> the 
>>>>> >> latest version of the package, 2) to report the output of 
>>>>> sessionInfo() and 
>>>>> >> traceback(), and 3) to post a complete code example. 
>>>>> >> 
>>>>> >> 
>>>>> >> You received this message because you are subscribed to the Google 
>>>>> Groups 
>>>>> >> "aroma.affymetrix" group with website http://www.aroma-project.org/. 
>>>>>
>>>>> >> To post to this group, send email to aroma-af...@googlegroups.com 
>>>>> >> To unsubscribe and other options, go to 
>>>>> >> http://www.aroma-project.org/forum/ 
>>>>> >> 
>>>>> >> --- 
>>>>> >> You received this message because you are subscribed to the Google 
>>>>> Groups 
>>>>> >> "aroma.affymetrix" group. 
>>>>> >> To unsubscribe from this group and stop receiving emails from it, 
>>>>> send an 
>>>>> >> email to aroma-affymetr...@googlegroups.com. 
>>>>> >> For more options, visit https://groups.google.com/groups/opt_out. 
>>>>> > 
>>>>> > 
>>>>> > -- 
>>>>> > -- 
>>>>> > When reporting problems on aroma.affymetrix, make sure 1) to run the 
>>>>> latest 
>>>>> > version of the package, 2) to report the output of sessionInfo() and 
>>>>> > traceback(), and 3) to post a complete code example. 
>>>>> > 
>>>>> > 
>>>>> > You received this message because you are subscribed to the Google 
>>>>> Groups 
>>>>> > "aroma.affymetrix" group with website http://www.aroma-project.org/. 
>>>>>
>>>>> > To post to this group, send email to aroma-af...@googlegroups.com 
>>>>> > To unsubscribe and other options, go to 
>>>>> http://www.aroma-project.org/forum/ 
>>>>> > 
>>>>> > --- 
>>>>> > You received this message because you are subscribed to the Google 
>>>>> Groups 
>>>>> > "aroma.affymetrix" group. 
>>>>> > To unsubscribe from this group and stop receiving emails from it, 
>>>>> send an 
>>>>> > email to aroma-affymetr...@googlegroups.com. 
>>>>> > For more options, visit https://groups.google.com/groups/opt_out. 
>>>>>
>>>>  -- 
>>>> -- 
>>>> When reporting problems on aroma.affymetrix, make sure 1) to run the 
>>>> latest version of the package, 2) to report the output of sessionInfo() 
>>>> and 
>>>> traceback(), and 3) to post a complete code example.
>>>>  
>>>>  
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "aroma.affymetrix" group with website 
>>>> http://www.aroma-project.org/.
>>>> To post to this group, send email to aroma-af...@googlegroups.com
>>>> To unsubscribe and other options, go to http://www.aroma-project.org/
>>>> forum/
>>>>  
>>>> --- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "aroma.affymetrix" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to aroma-affymetr...@googlegroups.com.
>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>
>>>  -- 
>> -- 
>> When reporting problems on aroma.affymetrix, make sure 1) to run the 
>> latest version of the package, 2) to report the output of sessionInfo() and 
>> traceback(), and 3) to post a complete code example.
>>  
>>  
>> You received this message because you are subscribed to the Google Groups 
>> "aroma.affymetrix" group with website http://www.aroma-project.org/.
>> To post to this group, send email to 
>> aroma-af...@googlegroups.com<javascript:>
>> To unsubscribe and other options, go to 
>> http://www.aroma-project.org/forum/
>>  
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "aroma.affymetrix" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to aroma-affymetr...@googlegroups.com <javascript:>.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>

-- 
-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

--- 
You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to aroma-affymetrix+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to