Thank you very much for the information; it was very helpful. I have now compared a tumor sample with a normal sample and fitted segmentation models using both CBS and GLAD.
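
To make the quantities concrete, here is a minimal sketch in base R of what I understand by a per-segment mean log2 ratio. The numbers and the names `thetaT`, `thetaR` and `segment` are made up for illustration; this only shows the arithmetic and is not meant to describe how CbsModel or GladModel compute their output.

```r
# Hypothetical non-log signal estimates at six loci (made-up values).
thetaT <- c(2100, 1950, 2050, 4100, 3900, 4000)  # tumor sample
thetaR <- c(2000, 2000, 2000, 2000, 2000, 2000)  # reference (matched normal or pooled average)

# Log2 copy-number ratio per locus, tumor relative to reference.
M <- log2(thetaT / thetaR)

# Hypothetical segmentation result: loci 1-3 form segment 1, loci 4-6 form segment 2.
segment <- c(1, 1, 1, 2, 2, 2)

# Per-segment mean log2 ratio and number of loci in each segment.
segMean <- tapply(M, segment, mean)
segCount <- tapply(M, segment, length)

regions <- data.frame(
  segment = names(segMean),
  mean = as.numeric(segMean),
  count = as.integer(segCount)
)
print(regions)
```

Under this reading, a segment whose signal is twice the reference would have a mean near 1, and an unchanged segment a mean near 0.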

The output of these models gives me the mean raw CN level for each region and the number of loci in the region. Is the "mean" value given by the model the mean log2 ratio of the copy number estimates?

Many thanks for your help

Ajanthah

On 04/08/2010 19:05, "Pierre Neuvial" <pie...@stat.berkeley.edu> wrote:

> Hi,
>
> On Tue, Aug 3, 2010 at 8:12 AM, Ajanthah Sangaralingam
> <a.sangaralin...@qmul.ac.uk> wrote:
>> Thank you for the reply. I actually need to get the log2 copy number ratios
>> from the raw .cel files of a GenomeWideSNP6.0 array - I was using CRMA1 but
>> am now repeating the analysis using CRMA v2.
>> I am putting all of the different tumour types either matched or unmatched
>> with a germline sample in the same directory and all the normal samples in
>> another directory.
>
> This is fine, but note that you did not need to put them in separate
> directories.
>
>> I will then go through the processes of quality assessment, crosstalk
>> calibration, normalization for probe sequence effect, probe summarization, and
>> normalization of PCR fragment length effects.
>>
>> Do I need to calculate the raw copy numbers and turn these into log2 copy
>> numbers?
>
> I don't understand your question.
>
>> How would I then calculate the copy numbers for
>> 1. Unpaired tumour samples - will need to be compared to a pooled reference
>> from a particular tumour type
>> 2. Paired samples?
>
>
> It's hard to be more specific than what Henrik already said without
> more details on your sample names and tumor types, and most
> importantly on the design of your study.
>
> I'll try for the unpaired analysis (your 1.)
>
> I am assuming that you have two data sets:
> - 'dsT' for the tumor samples,
> - 'dsN' for the normal samples.
>
> It seems that your concern is to use *tumor-type specific* sets of
> normal samples. Is that correct? See my remark below on the fact
> that I'm not sure it's what you should do.
>
> If so, then assuming that 'idxT1' contains the indices of all tumor
> samples from a particular tumor type in dsT, and 'idxN1' contains the
> indices of normal samples from the same tumor type in dsN, you can do
>
>   dsN1 <- extract(dsN, idxN1); ## normal samples of tumor type 1
>   dsT1 <- extract(dsT, idxT1); ## tumor samples of tumor type 1
>
>   dfR1 <- getAverageFile(dsN1); ## pool of normal samples of tumor type 1
>   sm1 <- CbsModel(dsT1, dfR1);
>
> Then you can do
>
>   fit(sm1, chromosome=1, array=1, verbose=log);
>
> to perform CBS segmentation and/or
>
>   rawCNs1 <- extractRawCopyNumbers(sm1, array=1, chromosome=1)
>   plot(rawCNs1)
>
> to extract and plot raw copy numbers (independently of CBS).
>
> And so on for each tumor type.
>
> This should answer your 1. However, I'm not sure that using
> tumor-type specific sets of normal samples will give you better
> results. This depends in particular on the following specific points
> in your design:
> - Are your "normals" normal tissue samples or blood samples?
> - Were all the tumor and normal microarrays done in the same lab, and
>   approximately at the same time? If so, combining all the normals
>   could be better.
> One way to know which option is best (tumor-specific reference or
> global reference) is to try both and compare the segmentation results
> (e.g. using ChromosomeExplorer).
>
> For your 2 (paired tumor/normal analysis), I think Henrik gave all the
> necessary information already, but
>
> assuming that 'idxT2' contains the indices of all tumor samples from a
> particular tumor type in dsT that have a paired normal, and 'idxN2'
> contains the indices of these paired normal samples from the same
> tumor type in dsN, further assuming that *the samples are in the same
> order in the two sets of indices*, you can do
>
>   dsN2 <- extract(dsN, idxN2);
>   dsT2 <- extract(dsT, idxT2);
>
>   sm2 <- CbsModel(dsT2, dsN2);
>
>   fit(sm2, chromosome=1, array=1, verbose=log);
>   rawCNs2 <- extractRawCopyNumbers(sm2, array=1, chromosome=1)
>   plot(rawCNs2);
>
> I hope this helps,
>
> Pierre.
>
>>
>> Many thanks for your help
>>
>> On 18/07/2010 12:01, "Ajanthah Sangaralingam" <a.sangaralin...@qmul.ac.uk>
>> wrote:
>>
>>> Hi,
>>>
>>> Yes this is correct.
>>>
>>> Many thanks
>>>
>>> Ajanthah
>>> ________________________________________
>>> From: aroma-affymetrix@googlegroups.com [aroma-affymet...@googlegroups.com]
>>> On Behalf Of Henrik Bengtsson [...@stat.berkeley.edu]
>>> Sent: Sunday, July 18, 2010 11:28 AM
>>> To: aroma-affymetrix
>>> Subject: Re: [aroma.affymetrix] Analysis of GenomeWideSNP6.0 data
>>>
>>> Hi.
>>>
>>> On Fri, Jul 16, 2010 at 11:13 AM, Ajanthah Sangaralingam
>>> <a.sangaralin...@qmul.ac.uk> wrote:
>>>> Hi,
>>>>
>>>> I have been doing some paired total copy number analysis in
>>>> aroma.affymetrix.
>>>> The dataset I have is complicated: for half the dataset I have reference
>>>> samples, for the other half I will do an unpaired analysis.
>>>
>>> So, to make sure I don't misunderstand, you have an Affymetrix
>>> GenomeWideSNP_6 (GWS6) data set that contains tumors and for some, but
>>> not all of them you have matched normal samples, where "matched normal"
>>> means a normal tissue or normal blood extract from the same patient as
>>> the tumor was taken. Is this correct?
>>>
>>>> I also have data from many different tumor types, not just one - I do not
>>>> have the same number of samples from each type of tumor.
>>>>
>>>> My questions are:
>>>>
>>>> When doing a paired analysis - the normal and tumour data have their own
>>>> directories and allelic crosstalk calibration, summarization and PCR
>>>> fragment length normalization are all done separately.
>>>
>>> It is important to know which preprocessing method you are following.
>>> Since you are working with GWS6 arrays, I recommend that you use the
>>> CRMAv2 preprocessing method as described in vignette 'Estimation of
>>> total copy numbers using the CRMA v2 method (10K-GWS6)':
>>>
>>>   http://aroma-project.org/vignettes/CRMAv2
>>>
>>> Note the function doCRMAv2() which is convenient when you do not want
>>> to dig into the details.
>>>
>>> Since you are not mentioning probe-sequence normalization, it looks
>>> like you are indeed using CRMA v1. If so, I recommend that you use
>>> CRMA v2 instead. Using CRMA v2 will be really useful for you, as
>>> explained below.
>>>
>>>> Is this true for the different tumor types as well - should they be treated
>>>> separately for all of these stages or can all the tumor types be put into
>>>> one tumour directory?
>>>
>>> This is perfectly fine if you are using CRMA v2 (but not CRMA v1).
>>> As now clarified in the vignette, in addition to the CRMAv2 paper, you
>>> will get identical results with CRMAv2 regardless of what other samples
>>> you put in your data set; the CRMAv2 method is truly a single-array
>>> method. It is only when you get to the step where you calculate copy
>>> numbers relative to a pool of references that you have to make a decision
>>> on what pool of reference samples you'll use.
>>>
>>>> Also, I am unable to extract the reference samples that I want after
>>>> normalization to compare to the matching samples, say, in another tumor type.
>>>> Segmentation models cannot be fit unless the number of samples match
>>>> exactly.
>>>
>>> They actually can, as explained below.
>>>
>>>>
>>>> Does this mean that I need to do all the stages again for the subsets of
>>>> reference samples that have matching pairs in the other tumor types?
>>>
>>> The segmentation models, for instance CbsModel, segment each tumor
>>> either (a) to a matched normal, or (b) to a global reference. When
>>> you do (a), by definition there has to be an equal number of tumors as
>>> matched normals, whereas when you do (b), there can only be one
>>> reference sample specified.
>>>
>>> Example of paired tumor-normal segmentation:
>>>
>>>   # A set of tumor samples
>>>   dsT <- ...
>>>   # A set of matched normal samples ordered such that they
>>>   # match the ordering in the tumor data set 'dsT'.
>>>   dsN <- ...
>>>   sm <- CbsModel(dsT, dsN);
>>>
>>> Example of tumor-global reference segmentation:
>>>
>>>   # A set of tumor samples
>>>   dsT <- ...
>>>   # A set of reference samples (can be normals or everything)
>>>   dsR <- ...
>>>   # Use the pool of all reference samples as the reference
>>>   dfR <- getAverageFile(dsR);
>>>   sm <- CbsModel(dsT, dfR);
>>>
>>> Note that 'dfR' is a single "virtual" array, not a data set.
>>>
>>> Did the above make sense?
>>>
>>> /Henrik
>>>
>>>>
>>>> Many thanks
>>>>
>>>> Ajanthah
>>>>
>>>>
>>>> This email may contain information that is privileged, confidential or
>>>> otherwise protected from disclosure.
>>>> It must not be used by, or its contents copied or disclosed to, persons
>>>> other than the addressee.
>>>> If you have received this email in error please notify the sender
>>>> immediately and delete the email.
>>>> This message has been scanned for viruses.
>>>>
>>>> --
>>>> When reporting problems on aroma.affymetrix, make sure 1) to run the latest
>>>> version of the package, 2) to report the output of sessionInfo() and
>>>> traceback(), and 3) to post a complete code example.
>>>>
>>>>
>>>> You received this message because you are subscribed to the Google Groups
>>>> "aroma.affymetrix" group with website http://www.aroma-project.org/.
>>>> To post to this group, send email to aroma-affymetrix@googlegroups.com
>>>> To unsubscribe and other options, go to http://www.aroma-project.org/forum/