### Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

I see. Thanks a lot, Sean On Tue, Aug 6, 2013 at 8:05 PM, Henrik Bengtsson < henrik.bengts...@aroma-project.org> wrote: > Hi. > > On Tue, Aug 6, 2013 at 12:31 PM, ying chen wrote: > > Hi Henrik, > > > > > > Sorry I have one more question to bug you. The urls in the df (by df <- > > readDataFrame(db);) point to UCSCgenome browser with NCBI36/hg18 > assembly. I > > looked at the http://www.aroma-project.org/chipTypes/GenomeWideSNP_6page > > resource part, the SNP6 chip type files are listed as based on hg19 > (Human > > Genome build v36.1). This is a typo, right? I thought it should be hg18. > So > > the Cbs out put is also based on hg18, right? > > The genome version ("hg18") in the URLs to the USCC Genome Browser are > (still) hard coded. Please see: > > > https://groups.google.com/d/msg/aroma-affymetrix/r8CtCvdGWnE/jzQFMifxt2MJ > > for more informations/details. > > /Henrik > > > > > Thanks a lot for the help! > > > > Sean > > > > > > > > > > > > > > > > On Fri, Aug 2, 2013 at 11:03 AM, Henrik Bengtsson > > wrote: > >> > >> On Thu, Aug 1, 2013 at 6:37 PM, ying chen wrote: > >> > Hi Henrik, > >> > > >> > Thanks a lot for the help. > >> > > >> > For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <- > >> > CbsModel(dsT, ref=dfR, tags="HapMapRef"), right? > >> > >> Correct, I meant: > >> > >> sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef") > >> > >> /H > >> > >> > > >> > Thanks, > >> > > >> > Sean > >> > > >> > > >> > > >> > On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson > >> > wrote: > >> >> > >> >> On Mon, Jul 29, 2013 at 7:45 PM, ying chen wrote: > >> >> > Hi Henrik, > >> >> > Thanks a lot for the help! > >> >> > Sorry I have more questions. I am following "How to: Calculate > total > >> >> > copy > >> >> > number ratios from total (non-polymorphic) signals" and "Vignette: > >> >> > Total > >> >> > copy-number segmentation (non-paired CBS)", but I am not sure if I > do > >> >> > it > >> >> > correctly. > >> >> > I have two SNP6 datasets Tumor and HapMap270 and I want to use > >> >> > HpaMap270 > >> >> > as > >> >> > reference to go all the way to CBS step. So I do the following > steps > >> >> > respectively. > >> >> > > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full") > >> >> > > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full") > >> >> > After that, I do > >> >> > > dataSet <- "HapMap270" > >> >> > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > >> >> > > chipType <- "GenomeWideSNP_6" > >> >> > > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > >> >> > chipType=chipType) > >> >> > > dataSet <- "Tumor" > >> >> > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > >> >> > > chipType <- "GenomeWideSNP_6" > >> >> > > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > >> >> > chipType=chipType) > >> >> > > dfR <- getAverageFile(dsN) # ? > >> >> > > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ? > >> >> > Would the above two steps work? My question is how to go ahead from > >> >> > here > >> >> > to > >> >> > do CBS and will sm <- CbsModel(dsTR) work? > >> >> > >> >> Skip the exportTotalCnRatioSet() call, and instead use: > >> >> > >> >> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"); > >> >> > >> >> The 'tags' is just to add an informative tag to the output data set. > >> >> > >> >> /Henrik > >> >> > >> >> > Thanks again for your help! > >> >> > Sean > >> >> > > >> >> > > >> >> > > >> >> > > >> >> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson > >> >> > wrote: > >> >> >> > >> >> >> Hi. > >> >> >> > >> >> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj > wrote: > >> >> >> > Hi guys, > >> >> >> > > >> >> >> > I have a question regarding how to calculate raw copy numbers > >> >> >> > using > >> >> >> > common > >> >> >> > reference instead of average of all samples of the study. > >> >> >> > Basically I > >> >> >> > want > >> >> >> > to use average of HapMap270 samples as reference for all further > >> >> >> > copy > >> >> >> > number > >> >> >> > calculations. > >> >> >> > > >> >> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette: > >> >> >> > Estimation > >> >> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) > to > >> >> >> > Step > >> >> >> > 5 - > >> >> >> > Calculation of raw copy numbers, and generated ceR and saved it > as > >> >> >> > a > >> >> >> > RData > >> >> >> > file ceR.Rdata. > >> >> >> > >> >> >> It's important to understand that almost all objects in the Aroma > >> >> >> framework are basically "pointers" to external files. For > instance, > >> >> >> your 'ceR', which I assume you've got from something like ceR <- > >> >> >> getAverageFile(ces), is referring to the file with pathname > >> >> >> getPathname(ceR). More below... > >> >> >> > >> >> >> > > >> >> >> > My first question is, how to use this data for any future copy > >> >> >> > number > >> >> >> > analysis? My guess is that instead of calculating the ceR from > the > >> >> >> > sample > >> >> >> > set I can

### Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

Hi. On Tue, Aug 6, 2013 at 12:31 PM, ying chen wrote: > Hi Henrik, > > > Sorry I have one more question to bug you. The urls in the df (by df <- > readDataFrame(db);) point to UCSCgenome browser with NCBI36/hg18 assembly. I > looked at the http://www.aroma-project.org/chipTypes/GenomeWideSNP_6 page > resource part, the SNP6 chip type files are listed as based on hg19 (Human > Genome build v36.1). This is a typo, right? I thought it should be hg18. So > the Cbs out put is also based on hg18, right? The genome version ("hg18") in the URLs to the USCC Genome Browser are (still) hard coded. Please see: https://groups.google.com/d/msg/aroma-affymetrix/r8CtCvdGWnE/jzQFMifxt2MJ for more informations/details. /Henrik > > Thanks a lot for the help! > > Sean > > > > > > > > On Fri, Aug 2, 2013 at 11:03 AM, Henrik Bengtsson > wrote: >> >> On Thu, Aug 1, 2013 at 6:37 PM, ying chen wrote: >> > Hi Henrik, >> > >> > Thanks a lot for the help. >> > >> > For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <- >> > CbsModel(dsT, ref=dfR, tags="HapMapRef"), right? >> >> Correct, I meant: >> >> sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef") >> >> /H >> >> > >> > Thanks, >> > >> > Sean >> > >> > >> > >> > On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson >> > wrote: >> >> >> >> On Mon, Jul 29, 2013 at 7:45 PM, ying chen wrote: >> >> > Hi Henrik, >> >> > Thanks a lot for the help! >> >> > Sorry I have more questions. I am following "How to: Calculate total >> >> > copy >> >> > number ratios from total (non-polymorphic) signals" and "Vignette: >> >> > Total >> >> > copy-number segmentation (non-paired CBS)", but I am not sure if I do >> >> > it >> >> > correctly. >> >> > I have two SNP6 datasets Tumor and HapMap270 and I want to use >> >> > HpaMap270 >> >> > as >> >> > reference to go all the way to CBS step. So I do the following steps >> >> > respectively. >> >> > > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full") >> >> > > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full") >> >> > After that, I do >> >> > > dataSet <- "HapMap270" >> >> > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" >> >> > > chipType <- "GenomeWideSNP_6" >> >> > > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, >> >> > chipType=chipType) >> >> > > dataSet <- "Tumor" >> >> > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" >> >> > > chipType <- "GenomeWideSNP_6" >> >> > > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, >> >> > chipType=chipType) >> >> > > dfR <- getAverageFile(dsN) # ? >> >> > > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ? >> >> > Would the above two steps work? My question is how to go ahead from >> >> > here >> >> > to >> >> > do CBS and will sm <- CbsModel(dsTR) work? >> >> >> >> Skip the exportTotalCnRatioSet() call, and instead use: >> >> >> >> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"); >> >> >> >> The 'tags' is just to add an informative tag to the output data set. >> >> >> >> /Henrik >> >> >> >> > Thanks again for your help! >> >> > Sean >> >> > >> >> > >> >> > >> >> > >> >> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson >> >> > wrote: >> >> >> >> >> >> Hi. >> >> >> >> >> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj wrote: >> >> >> > Hi guys, >> >> >> > >> >> >> > I have a question regarding how to calculate raw copy numbers >> >> >> > using >> >> >> > common >> >> >> > reference instead of average of all samples of the study. >> >> >> > Basically I >> >> >> > want >> >> >> > to use average of HapMap270 samples as reference for all further >> >> >> > copy >> >> >> > number >> >> >> > calculations. >> >> >> > >> >> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette: >> >> >> > Estimation >> >> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to >> >> >> > Step >> >> >> > 5 - >> >> >> > Calculation of raw copy numbers, and generated ceR and saved it as >> >> >> > a >> >> >> > RData >> >> >> > file ceR.Rdata. >> >> >> >> >> >> It's important to understand that almost all objects in the Aroma >> >> >> framework are basically "pointers" to external files. For instance, >> >> >> your 'ceR', which I assume you've got from something like ceR <- >> >> >> getAverageFile(ces), is referring to the file with pathname >> >> >> getPathname(ceR). More below... >> >> >> >> >> >> > >> >> >> > My first question is, how to use this data for any future copy >> >> >> > number >> >> >> > analysis? My guess is that instead of calculating the ceR from the >> >> >> > sample >> >> >> > set I can just load the ceR.RData file I saved and use it. Right? >> >> >> >> >> >> First of all, please note that when do ceR <- getAverageFile(ces) on >> >> >> the same data set 'ces', the result is already available on file and >> >> >> it will be quickly found and returned. In other words, it will not >> >> >> recalcuate the averages again [unless you do ceR <- >> >> >> getAverageFile(ces, force=TRUE)]. >> >> >> >> >> >> However, I do unde

### Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

Hi Henrik, Sorry I have one more question to bug you. The urls in the df (by df <- readDataFrame(db);) point to UCSCgenome browser with NCBI36/hg18 assembly. I looked at the http://www.aroma-project.org/chipTypes/GenomeWideSNP_6 page resource part, the SNP6 chip type files are listed as based on hg19 (Human Genome build v36.1). This is a typo, right? I thought it should be hg18. So the Cbs out put is also based on hg18, right? Thanks a lot for the help! Sean On Fri, Aug 2, 2013 at 11:03 AM, Henrik Bengtsson < henrik.bengts...@aroma-project.org> wrote: > On Thu, Aug 1, 2013 at 6:37 PM, ying chen wrote: > > Hi Henrik, > > > > Thanks a lot for the help. > > > > For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <- > > CbsModel(dsT, ref=dfR, tags="HapMapRef"), right? > > Correct, I meant: > > sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef") > > /H > > > > > Thanks, > > > > Sean > > > > > > > > On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson > > wrote: > >> > >> On Mon, Jul 29, 2013 at 7:45 PM, ying chen wrote: > >> > Hi Henrik, > >> > Thanks a lot for the help! > >> > Sorry I have more questions. I am following "How to: Calculate total > >> > copy > >> > number ratios from total (non-polymorphic) signals" and "Vignette: > Total > >> > copy-number segmentation (non-paired CBS)", but I am not sure if I do > it > >> > correctly. > >> > I have two SNP6 datasets Tumor and HapMap270 and I want to use > HpaMap270 > >> > as > >> > reference to go all the way to CBS step. So I do the following steps > >> > respectively. > >> > > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full") > >> > > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full") > >> > After that, I do > >> > > dataSet <- "HapMap270" > >> > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > >> > > chipType <- "GenomeWideSNP_6" > >> > > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > >> > chipType=chipType) > >> > > dataSet <- "Tumor" > >> > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > >> > > chipType <- "GenomeWideSNP_6" > >> > > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > >> > chipType=chipType) > >> > > dfR <- getAverageFile(dsN) # ? > >> > > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ? > >> > Would the above two steps work? My question is how to go ahead from > here > >> > to > >> > do CBS and will sm <- CbsModel(dsTR) work? > >> > >> Skip the exportTotalCnRatioSet() call, and instead use: > >> > >> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"); > >> > >> The 'tags' is just to add an informative tag to the output data set. > >> > >> /Henrik > >> > >> > Thanks again for your help! > >> > Sean > >> > > >> > > >> > > >> > > >> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson > >> > wrote: > >> >> > >> >> Hi. > >> >> > >> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj wrote: > >> >> > Hi guys, > >> >> > > >> >> > I have a question regarding how to calculate raw copy numbers using > >> >> > common > >> >> > reference instead of average of all samples of the study. > Basically I > >> >> > want > >> >> > to use average of HapMap270 samples as reference for all further > copy > >> >> > number > >> >> > calculations. > >> >> > > >> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette: > >> >> > Estimation > >> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to > >> >> > Step > >> >> > 5 - > >> >> > Calculation of raw copy numbers, and generated ceR and saved it as > a > >> >> > RData > >> >> > file ceR.Rdata. > >> >> > >> >> It's important to understand that almost all objects in the Aroma > >> >> framework are basically "pointers" to external files. For instance, > >> >> your 'ceR', which I assume you've got from something like ceR <- > >> >> getAverageFile(ces), is referring to the file with pathname > >> >> getPathname(ceR). More below... > >> >> > >> >> > > >> >> > My first question is, how to use this data for any future copy > number > >> >> > analysis? My guess is that instead of calculating the ceR from the > >> >> > sample > >> >> > set I can just load the ceR.RData file I saved and use it. Right? > >> >> > >> >> First of all, please note that when do ceR <- getAverageFile(ces) on > >> >> the same data set 'ces', the result is already available on file and > >> >> it will be quickly found and returned. In other words, it will not > >> >> recalcuate the averages again [unless you do ceR <- > >> >> getAverageFile(ces, force=TRUE)]. > >> >> > >> >> However, I do understand that you may not want to have to keep a > large > >> >> 'ces' data set around, when you're only interested in the pooled > >> >> average. In that case, I would copy the file containing the > "average" > >> >> to a new data set. Currently, this is not straightforward in Aroma > >> >> (I'll think about something), but you can do the following: > >> >> > >> >> # Calculate the pooled average > >> >> > ceR <- getAverageFile(cesN); > >> >> > >> >> # Copy t

On Thu, Aug 1, 2013 at 6:37 PM, ying chen wrote: > Hi Henrik, > > Thanks a lot for the help. > > For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <- > CbsModel(dsT, ref=dfR, tags="HapMapRef"), right? Correct, I meant: sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef") /H > > Thanks, > > Sean > > > > On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson > wrote: >> >> On Mon, Jul 29, 2013 at 7:45 PM, ying chen wrote: >> > Hi Henrik, >> > Thanks a lot for the help! >> > Sorry I have more questions. I am following "How to: Calculate total >> > copy >> > number ratios from total (non-polymorphic) signals" and "Vignette: Total >> > copy-number segmentation (non-paired CBS)", but I am not sure if I do it >> > correctly. >> > I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 >> > as >> > reference to go all the way to CBS step. So I do the following steps >> > respectively. >> > > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full") >> > > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full") >> > After that, I do >> > > dataSet <- "HapMap270" >> > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" >> > > chipType <- "GenomeWideSNP_6" >> > > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, >> > chipType=chipType) >> > > dataSet <- "Tumor" >> > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" >> > > chipType <- "GenomeWideSNP_6" >> > > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, >> > chipType=chipType) >> > > dfR <- getAverageFile(dsN) # ? >> > > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ? >> > Would the above two steps work? My question is how to go ahead from here >> > to >> > do CBS and will sm <- CbsModel(dsTR) work? >> >> Skip the exportTotalCnRatioSet() call, and instead use: >> >> sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"); >> >> The 'tags' is just to add an informative tag to the output data set. >> >> /Henrik >> >> > Thanks again for your help! >> > Sean >> > >> > >> > >> > >> > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson >> > wrote: >> >> >> >> Hi. >> >> >> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj wrote: >> >> > Hi guys, >> >> > >> >> > I have a question regarding how to calculate raw copy numbers using >> >> > common >> >> > reference instead of average of all samples of the study. Basically I >> >> > want >> >> > to use average of HapMap270 samples as reference for all further copy >> >> > number >> >> > calculations. >> >> > >> >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette: >> >> > Estimation >> >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to >> >> > Step >> >> > 5 - >> >> > Calculation of raw copy numbers, and generated ceR and saved it as a >> >> > RData >> >> > file ceR.Rdata. >> >> >> >> It's important to understand that almost all objects in the Aroma >> >> framework are basically "pointers" to external files. For instance, >> >> your 'ceR', which I assume you've got from something like ceR <- >> >> getAverageFile(ces), is referring to the file with pathname >> >> getPathname(ceR). More below... >> >> >> >> > >> >> > My first question is, how to use this data for any future copy number >> >> > analysis? My guess is that instead of calculating the ceR from the >> >> > sample >> >> > set I can just load the ceR.RData file I saved and use it. Right? >> >> >> >> First of all, please note that when do ceR <- getAverageFile(ces) on >> >> the same data set 'ces', the result is already available on file and >> >> it will be quickly found and returned. In other words, it will not >> >> recalcuate the averages again [unless you do ceR <- >> >> getAverageFile(ces, force=TRUE)]. >> >> >> >> However, I do understand that you may not want to have to keep a large >> >> 'ces' data set around, when you're only interested in the pooled >> >> average. In that case, I would copy the file containing the "average" >> >> to a new data set. Currently, this is not straightforward in Aroma >> >> (I'll think about something), but you can do the following: >> >> >> >> # Calculate the pooled average >> >> > ceR <- getAverageFile(cesN); >> >> >> >> # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. >> >> > filename <- getFilename(ceR); >> >> > filename >> >> [1] >> >> ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL" >> >> > rootPath <- getParent(getPath(cesN), depth=2L); >> >> > dataSet <- "HapMap270,pooled"; >> >> > chipType <- getChipType(ceR, fullname=FALSE); >> >> > path <- file.path(rootPath, dataSet, chipType); >> >> > path >> >> [1] "plmData/HapMap270,pooled/GenomeWideSNP_6" >> >> > mkdirs(path); >> >> > copyFile(getPathname(ceR), file.path(path, filename)); >> >> >> >> With this done, you can then grab this pooled reference as: >> >> >> >> > library("aroma.affymetrix") >> >> > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6"; >> >> > filename <- >> >> > >> >> > ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d

Hi Henrik, Thanks a lot for the help. For sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"), you mean sm <- CbsModel(dsT, ref=dfR, tags="HapMapRef"), right? Thanks, Sean On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson < henrik.bengts...@aroma-project.org> wrote: > On Mon, Jul 29, 2013 at 7:45 PM, ying chen wrote: > > Hi Henrik, > > Thanks a lot for the help! > > Sorry I have more questions. I am following "How to: Calculate total copy > > number ratios from total (non-polymorphic) signals" and "Vignette: Total > > copy-number segmentation (non-paired CBS)", but I am not sure if I do it > > correctly. > > I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 > as > > reference to go all the way to CBS step. So I do the following steps > > respectively. > > > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full") > > > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full") > > After that, I do > > > dataSet <- "HapMap270" > > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > > > chipType <- "GenomeWideSNP_6" > > > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > > chipType=chipType) > > > dataSet <- "Tumor" > > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > > > chipType <- "GenomeWideSNP_6" > > > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > > chipType=chipType) > > > dfR <- getAverageFile(dsN) # ? > > > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ? > > Would the above two steps work? My question is how to go ahead from here > to > > do CBS and will sm <- CbsModel(dsTR) work? > > Skip the exportTotalCnRatioSet() call, and instead use: > > sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"); > > The 'tags' is just to add an informative tag to the output data set. > > /Henrik > > > Thanks again for your help! > > Sean > > > > > > > > > > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson > > wrote: > >> > >> Hi. > >> > >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj wrote: > >> > Hi guys, > >> > > >> > I have a question regarding how to calculate raw copy numbers using > >> > common > >> > reference instead of average of all samples of the study. Basically I > >> > want > >> > to use average of HapMap270 samples as reference for all further copy > >> > number > >> > calculations. > >> > > >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette: > >> > Estimation > >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to > Step > >> > 5 - > >> > Calculation of raw copy numbers, and generated ceR and saved it as a > >> > RData > >> > file ceR.Rdata. > >> > >> It's important to understand that almost all objects in the Aroma > >> framework are basically "pointers" to external files. For instance, > >> your 'ceR', which I assume you've got from something like ceR <- > >> getAverageFile(ces), is referring to the file with pathname > >> getPathname(ceR). More below... > >> > >> > > >> > My first question is, how to use this data for any future copy number > >> > analysis? My guess is that instead of calculating the ceR from the > >> > sample > >> > set I can just load the ceR.RData file I saved and use it. Right? > >> > >> First of all, please note that when do ceR <- getAverageFile(ces) on > >> the same data set 'ces', the result is already available on file and > >> it will be quickly found and returned. In other words, it will not > >> recalcuate the averages again [unless you do ceR <- > >> getAverageFile(ces, force=TRUE)]. > >> > >> However, I do understand that you may not want to have to keep a large > >> 'ces' data set around, when you're only interested in the pooled > >> average. In that case, I would copy the file containing the "average" > >> to a new data set. Currently, this is not straightforward in Aroma > >> (I'll think about something), but you can do the following: > >> > >> # Calculate the pooled average > >> > ceR <- getAverageFile(cesN); > >> > >> # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. > >> > filename <- getFilename(ceR); > >> > filename > >> [1] > ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL" > >> > rootPath <- getParent(getPath(cesN), depth=2L); > >> > dataSet <- "HapMap270,pooled"; > >> > chipType <- getChipType(ceR, fullname=FALSE); > >> > path <- file.path(rootPath, dataSet, chipType); > >> > path > >> [1] "plmData/HapMap270,pooled/GenomeWideSNP_6" > >> > mkdirs(path); > >> > copyFile(getPathname(ceR), file.path(path, filename)); > >> > >> With this done, you can then grab this pooled reference as: > >> > >> > library("aroma.affymetrix") > >> > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6"; > >> > filename <- > >> > > ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"; > >> > ceR <- CnChipEffectFile(filename, path=path); > >> > >> Note, when you save 'ceR', you are basically saving the reference to > >> the file. Yes, you can load it later, but make sure not to move it, > >> otherwise you'll get some t

On Mon, Jul 29, 2013 at 7:45 PM, ying chen wrote: > Hi Henrik, > Thanks a lot for the help! > Sorry I have more questions. I am following "How to: Calculate total copy > number ratios from total (non-polymorphic) signals" and "Vignette: Total > copy-number segmentation (non-paired CBS)", but I am not sure if I do it > correctly. > I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 as > reference to go all the way to CBS step. So I do the following steps > respectively. > > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full") > > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full") > After that, I do > > dataSet <- "HapMap270" > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > > chipType <- "GenomeWideSNP_6" > > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > chipType=chipType) > > dataSet <- "Tumor" > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > > chipType <- "GenomeWideSNP_6" > > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > chipType=chipType) > > dfR <- getAverageFile(dsN) # ? > > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ? > Would the above two steps work? My question is how to go ahead from here to > do CBS and will sm <- CbsModel(dsTR) work? Skip the exportTotalCnRatioSet() call, and instead use: sm <- CbsModel(dsTR, ref=dfR, tags="HapMapRef"); The 'tags' is just to add an informative tag to the output data set. /Henrik > Thanks again for your help! > Sean > > > > > On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson > wrote: >> >> Hi. >> >> On Fri, Jul 26, 2013 at 8:02 AM, sean nj wrote: >> > Hi guys, >> > >> > I have a question regarding how to calculate raw copy numbers using >> > common >> > reference instead of average of all samples of the study. Basically I >> > want >> > to use average of HapMap270 samples as reference for all further copy >> > number >> > calculations. >> > >> > I have a bunch HapMap270 snp6 cel files and I followed Vignette: >> > Estimation >> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step >> > 5 - >> > Calculation of raw copy numbers, and generated ceR and saved it as a >> > RData >> > file ceR.Rdata. >> >> It's important to understand that almost all objects in the Aroma >> framework are basically "pointers" to external files. For instance, >> your 'ceR', which I assume you've got from something like ceR <- >> getAverageFile(ces), is referring to the file with pathname >> getPathname(ceR). More below... >> >> > >> > My first question is, how to use this data for any future copy number >> > analysis? My guess is that instead of calculating the ceR from the >> > sample >> > set I can just load the ceR.RData file I saved and use it. Right? >> >> First of all, please note that when do ceR <- getAverageFile(ces) on >> the same data set 'ces', the result is already available on file and >> it will be quickly found and returned. In other words, it will not >> recalcuate the averages again [unless you do ceR <- >> getAverageFile(ces, force=TRUE)]. >> >> However, I do understand that you may not want to have to keep a large >> 'ces' data set around, when you're only interested in the pooled >> average. In that case, I would copy the file containing the "average" >> to a new data set. Currently, this is not straightforward in Aroma >> (I'll think about something), but you can do the following: >> >> # Calculate the pooled average >> > ceR <- getAverageFile(cesN); >> >> # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. >> > filename <- getFilename(ceR); >> > filename >> [1] ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL" >> > rootPath <- getParent(getPath(cesN), depth=2L); >> > dataSet <- "HapMap270,pooled"; >> > chipType <- getChipType(ceR, fullname=FALSE); >> > path <- file.path(rootPath, dataSet, chipType); >> > path >> [1] "plmData/HapMap270,pooled/GenomeWideSNP_6" >> > mkdirs(path); >> > copyFile(getPathname(ceR), file.path(path, filename)); >> >> With this done, you can then grab this pooled reference as: >> >> > library("aroma.affymetrix") >> > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6"; >> > filename <- >> > ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"; >> > ceR <- CnChipEffectFile(filename, path=path); >> >> Note, when you save 'ceR', you are basically saving the reference to >> the file. Yes, you can load it later, but make sure not to move it, >> otherwise you'll get some type of "file not found" error. >> >> > saveObject(ceR, "HapMap270,GenomeWideSNP_6,reference.Rdata"); >> >> If already saved, and file not moved, you can then do: >> >> > library("aroma.affymetrix"); >> > ceR <- loadObject("HapMap270,GenomeWideSNP_6,reference.Rdata"); >> >> All this is very ad hoc (=non-aroma style), and as I said, I'll see if >> I can come up with a cleaner solution for storing and retrieving >> pooled averages. >> >> > >> > My second question is, how to go ahead from there to calculate

Sorry, one more question regarding CBS. During fit(sm, verbose=-10) Building tuple of reference sets... Type of reference: median No reference available. Calculating average copy-number signals... Retrieving average unit signals across 291 arrays... .. I thought I used HapMap270 data as reference, why here it says no reference available and uses average instead? Thanks, Sean On Thu, Aug 1, 2013 at 10:56 AM, ying chen wrote: > Sorry, one more question regarding CBS. During > > fit(sm, verbose=-10) > Building tuple of reference sets... > Type of reference: median > No reference available. > Calculating average copy-number signals... > Retrieving average unit signals across 291 arrays... > .. > > > On Thu, Aug 1, 2013 at 10:53 AM, ying chen wrote: > >> Hi Henrik, >> >> I tried method I mentioned above. But I got an error message when running >> fit(sm, verbose=-10). >> >> >> >> Array #1 ('321T') of 291 on chromosome 1... >> Error in UseMethod("getChecksum") : >>no applicable method for 'getChecksum' applied to an object of class >> "list" >> >> What did I do wrong? >> >> Thanks a lot for the help! >> >> Sean >> >> >> > library(aroma.affymetrix) >> >> > dataSet <- "HapMap270" >> > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" >> > chipType <- "GenomeWideSNP_6" >> > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, >> chipType=chipType) >> > dataSet <- "Tumor" >> > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, >> chipType=chipType) >> > dfR <- getAverageFile(dsN) >> > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) >> There were 50 or more warnings (use warnings() to see the first 50) >> > warnings() >> Warning messages: >> 1: In log(C) : NaNs produced >> 2: In log(C) : NaNs produced >> 3: In log(C) : NaNs produced >> ... >> > print(dsTR) >> AromaUnitTotalCnBinarySet: >> Name: Tumor >> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY >> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY >> Number of files: 291 >> Names: 321T, 322T, 323T, ..., T97 [291] >> Path (to the first file): >> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 >> Total file size: 2088.72 MB >> RAM: 0.37MB >> > print(dsT) >> AromaUnitTotalCnBinarySet: >> Name: Tumor >> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY >> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY >> Number of files: 291 >> Names: 321T, 322T, 323T, ..., T97 [291] >> Path (to the first file): >> totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 >> Total file size: 2088.64 MB >> RAM: 0.37MB >> > >> > sm <- CbsModel(dsTR) >> > >> Loading required package: DNAcopy >> > print(sm) >> CbsModel: >> Name: Tumor >> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired >> Chip type (virtual): GenomeWideSNP_6 >> Path: >> cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6 >> Number of chip types: 1 >> Sample & reference file pairs: >> Chip type #1 ('GenomeWideSNP_6') of 1: >> Sample data set: >> AromaUnitTotalCnBinarySet: >> Name: Tumor >> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY >> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY >> Number of files: 291 >> Names: 321T, 322T, 323T, ..., T97 [291] >> Path (to the first file): >> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 >> Total file size: 2088.72 MB >> RAM: 0.37MB >> Reference: >> RAM: 0.00MB >> > >> > fit(sm, verbose=-10) >> Building tuple of reference sets... >> Type of reference: median >> No reference available. >> Calculating average copy-number signals... >> Retrieving average unit signals across 291 arrays... >>AromaUnitTotalCnBinaryFile: >>Name: .average-signals-median-mad >>Tags: 8143f776d59b4c432debaf4bb288 >>Full name: .average-signals-median-mad,8143f776d59b4c432debaf4bb288 >>Pathname: >> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f776d59b4c432debaf4bb288.asb >>File size: 7.18 MB (7526027 bytes) >>RAM: 0.00 MB >>Number of data rows: 1881415 >>File format: v1 >>Dimensions: 1881415x1 >>Column classes: double >>Number of bytes per column: 4 >>Footer: 20130801 10:20:33 >> EDTAffymetrixGenomeWideSNP_6,Full29160c565ccf6239a081e4162b3328e1ccbmedianmad >>Platform: Affymetrix >>Chip type: GenomeWideSNP_6,Full >>Number of units to be updated: 1 >>Processing chunk... >> chr "Indices in chunk:" >> int 22933 >> Reading data... >> Reading data...done >> Estimating averages and standard deviations... >> Estimating averages and standard deviations...done >> Writing estimates... >> Writing estimates...done >>Processing chunk...done >> Retrieving average unit signals across 291 arrays...done >> Calculating average copy-number signals...done >> Building tuple of reference sets...done >> Using reference tuple: >> AromaUnitTotalCnBinarySetTuple: >> Name: Tumor >> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY >> C

Sorry, one more question regarding CBS. During fit(sm, verbose=-10) Building tuple of reference sets... Type of reference: median No reference available. Calculating average copy-number signals... Retrieving average unit signals across 291 arrays... .. On Thu, Aug 1, 2013 at 10:53 AM, ying chen wrote: > Hi Henrik, > > I tried method I mentioned above. But I got an error message when running > fit(sm, verbose=-10). > > > > Array #1 ('321T') of 291 on chromosome 1... > Error in UseMethod("getChecksum") : >no applicable method for 'getChecksum' applied to an object of class > "list" > > What did I do wrong? > > Thanks a lot for the help! > > Sean > > > > library(aroma.affymetrix) > > > dataSet <- "HapMap270" > > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > > chipType <- "GenomeWideSNP_6" > > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > chipType=chipType) > > dataSet <- "Tumor" > > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, > chipType=chipType) > > dfR <- getAverageFile(dsN) > > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) > There were 50 or more warnings (use warnings() to see the first 50) > > warnings() > Warning messages: > 1: In log(C) : NaNs produced > 2: In log(C) : NaNs produced > 3: In log(C) : NaNs produced > ... > > print(dsTR) > AromaUnitTotalCnBinarySet: > Name: Tumor > Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Number of files: 291 > Names: 321T, 322T, 323T, ..., T97 [291] > Path (to the first file): > rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 > Total file size: 2088.72 MB > RAM: 0.37MB > > print(dsT) > AromaUnitTotalCnBinarySet: > Name: Tumor > Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Number of files: 291 > Names: 321T, 322T, 323T, ..., T97 [291] > Path (to the first file): > totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 > Total file size: 2088.64 MB > RAM: 0.37MB > > > > sm <- CbsModel(dsTR) > > > Loading required package: DNAcopy > > print(sm) > CbsModel: > Name: Tumor > Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired > Chip type (virtual): GenomeWideSNP_6 > Path: > cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6 > Number of chip types: 1 > Sample & reference file pairs: > Chip type #1 ('GenomeWideSNP_6') of 1: > Sample data set: > AromaUnitTotalCnBinarySet: > Name: Tumor > Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Number of files: 291 > Names: 321T, 322T, 323T, ..., T97 [291] > Path (to the first file): > rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 > Total file size: 2088.72 MB > RAM: 0.37MB > Reference: > RAM: 0.00MB > > > > fit(sm, verbose=-10) > Building tuple of reference sets... > Type of reference: median > No reference available. > Calculating average copy-number signals... > Retrieving average unit signals across 291 arrays... >AromaUnitTotalCnBinaryFile: >Name: .average-signals-median-mad >Tags: 8143f776d59b4c432debaf4bb288 >Full name: .average-signals-median-mad,8143f776d59b4c432debaf4bb288 >Pathname: > rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f776d59b4c432debaf4bb288.asb >File size: 7.18 MB (7526027 bytes) >RAM: 0.00 MB >Number of data rows: 1881415 >File format: v1 >Dimensions: 1881415x1 >Column classes: double >Number of bytes per column: 4 >Footer: 20130801 10:20:33 > EDTAffymetrixGenomeWideSNP_6,Full29160c565ccf6239a081e4162b3328e1ccbmedianmad >Platform: Affymetrix >Chip type: GenomeWideSNP_6,Full >Number of units to be updated: 1 >Processing chunk... > chr "Indices in chunk:" > int 22933 > Reading data... > Reading data...done > Estimating averages and standard deviations... > Estimating averages and standard deviations...done > Writing estimates... > Writing estimates...done >Processing chunk...done > Retrieving average unit signals across 291 arrays...done > Calculating average copy-number signals...done > Building tuple of reference sets...done > Using reference tuple: > AromaUnitTotalCnBinarySetTuple: > Name: Tumor > Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Chip types: GenomeWideSNP_6 > AromaUnitTotalCnBinarySet: > Name: Tumor > Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY > Number of files: 291 > Names: .average-signals-median-mad, .average-signals-median-mad, > .average-signals-median-mad, ..., .average-signals-median-mad [291] > Path (to the first file): > rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 > Total file size: 2088.62 MB > RAM: 0.37MB > RAM: 0.00MB > Extract DataFileMatrix... > Array: 1 > Test data sets: > AromaUnitTotalCnBinarySetTuple: > Name: Tumor > Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,F

Hi Henrik, I tried method I mentioned above. But I got an error message when running fit(sm, verbose=-10). Array #1 ('321T') of 291 on chromosome 1... Error in UseMethod("getChecksum") : no applicable method for 'getChecksum' applied to an object of class "list" What did I do wrong? Thanks a lot for the help! Sean > library(aroma.affymetrix) > dataSet <- "HapMap270" > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > chipType <- "GenomeWideSNP_6" > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) > dataSet <- "Tumor" > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) > dfR <- getAverageFile(dsN) > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) There were 50 or more warnings (use warnings() to see the first 50) > warnings() Warning messages: 1: In log(C) : NaNs produced 2: In log(C) : NaNs produced 3: In log(C) : NaNs produced ... > print(dsTR) AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.72 MB RAM: 0.37MB > print(dsT) AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.64 MB RAM: 0.37MB > > sm <- CbsModel(dsTR) > Loading required package: DNAcopy > print(sm) CbsModel: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired Chip type (virtual): GenomeWideSNP_6 Path: cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6 Number of chip types: 1 Sample & reference file pairs: Chip type #1 ('GenomeWideSNP_6') of 1: Sample data set: AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.72 MB RAM: 0.37MB Reference: RAM: 0.00MB > > fit(sm, verbose=-10) Building tuple of reference sets... Type of reference: median No reference available. Calculating average copy-number signals... Retrieving average unit signals across 291 arrays... AromaUnitTotalCnBinaryFile: Name: .average-signals-median-mad Tags: 8143f776d59b4c432debaf4bb288 Full name: .average-signals-median-mad,8143f776d59b4c432debaf4bb288 Pathname: rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f776d59b4c432debaf4bb288.asb File size: 7.18 MB (7526027 bytes) RAM: 0.00 MB Number of data rows: 1881415 File format: v1 Dimensions: 1881415x1 Column classes: double Number of bytes per column: 4 Footer: 20130801 10:20:33 EDTAffymetrixGenomeWideSNP_6,Full29160c565ccf6239a081e4162b3328e1ccbmedianmad Platform: Affymetrix Chip type: GenomeWideSNP_6,Full Number of units to be updated: 1 Processing chunk... chr "Indices in chunk:" int 22933 Reading data... Reading data...done Estimating averages and standard deviations... Estimating averages and standard deviations...done Writing estimates... Writing estimates...done Processing chunk...done Retrieving average unit signals across 291 arrays...done Calculating average copy-number signals...done Building tuple of reference sets...done Using reference tuple: AromaUnitTotalCnBinarySetTuple: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Chip types: GenomeWideSNP_6 AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: .average-signals-median-mad, .average-signals-median-mad, .average-signals-median-mad, ..., .average-signals-median-mad [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.62 MB RAM: 0.37MB RAM: 0.00MB Extract DataFileMatrix... Array: 1 Test data sets: AromaUnitTotalCnBinarySetTuple: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Chip types: GenomeWideSNP_6 AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.72 MB RAM: 0.37MB RAM: 0.00MB Test data files: $`GenomeWideSNP_6,Full` AromaUnitTotalCnBinaryFile: Name: 321T Tags: ref=.average-signals-median-mad,899bb16b33895d8872112e64f5eb74b7,log2ratio,total Full name: 321T,ref=.average-signals-median-mad

Hi Henrik, Thanks a lot for the help! Sorry I have more questions. I am following "How to: Calculate total copy number ratios from total (non-polymorphic) signals" and "Vignette: Total copy-number segmentation (non-paired CBS)", but I am not sure if I do it correctly. I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 as reference to go all the way to CBS step. So I do the following steps respectively. > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full") > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full") After that, I do > dataSet <- "HapMap270" > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > chipType <- "GenomeWideSNP_6" > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) > dataSet <- "Tumor" > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY" > chipType <- "GenomeWideSNP_6" > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) > dfR <- getAverageFile(dsN) # ? > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ? Would the above two steps work? My question is how to go ahead from here to do CBS and will sm <- CbsModel(dsTR) work? Thanks again for your help! Sean On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson < henrik.bengts...@aroma-project.org> wrote: > Hi. > > On Fri, Jul 26, 2013 at 8:02 AM, sean nj wrote: > > Hi guys, > > > > I have a question regarding how to calculate raw copy numbers using > common > > reference instead of average of all samples of the study. Basically I > want > > to use average of HapMap270 samples as reference for all further copy > number > > calculations. > > > > I have a bunch HapMap270 snp6 cel files and I followed Vignette: > Estimation > > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step > 5 - > > Calculation of raw copy numbers, and generated ceR and saved it as a > RData > > file ceR.Rdata. > > It's important to understand that almost all objects in the Aroma > framework are basically "pointers" to external files. For instance, > your 'ceR', which I assume you've got from something like ceR <- > getAverageFile(ces), is referring to the file with pathname > getPathname(ceR). More below... > > > > > My first question is, how to use this data for any future copy number > > analysis? My guess is that instead of calculating the ceR from the sample > > set I can just load the ceR.RData file I saved and use it. Right? > > First of all, please note that when do ceR <- getAverageFile(ces) on > the same data set 'ces', the result is already available on file and > it will be quickly found and returned. In other words, it will not > recalcuate the averages again [unless you do ceR <- > getAverageFile(ces, force=TRUE)]. > > However, I do understand that you may not want to have to keep a large > 'ces' data set around, when you're only interested in the pooled > average. In that case, I would copy the file containing the "average" > to a new data set. Currently, this is not straightforward in Aroma > (I'll think about something), but you can do the following: > > # Calculate the pooled average > > ceR <- getAverageFile(cesN); > > # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. > > filename <- getFilename(ceR); > > filename > [1] ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL" > > rootPath <- getParent(getPath(cesN), depth=2L); > > dataSet <- "HapMap270,pooled"; > > chipType <- getChipType(ceR, fullname=FALSE); > > path <- file.path(rootPath, dataSet, chipType); > > path > [1] "plmData/HapMap270,pooled/GenomeWideSNP_6" > > mkdirs(path); > > copyFile(getPathname(ceR), file.path(path, filename)); > > With this done, you can then grab this pooled reference as: > > > library("aroma.affymetrix") > > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6"; > > filename <- > ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"; > > ceR <- CnChipEffectFile(filename, path=path); > > Note, when you save 'ceR', you are basically saving the reference to > the file. Yes, you can load it later, but make sure not to move it, > otherwise you'll get some type of "file not found" error. > > > saveObject(ceR, "HapMap270,GenomeWideSNP_6,reference.Rdata"); > > If already saved, and file not moved, you can then do: > > > library("aroma.affymetrix"); > > ceR <- loadObject("HapMap270,GenomeWideSNP_6,reference.Rdata"); > > All this is very ad hoc (=non-aroma style), and as I said, I'll see if > I can come up with a cleaner solution for storing and retrieving > pooled averages. > > > > > My second question is, how to go ahead from there to calculate the > relative > > copy numbers for all unit from all samples? The two examples given in the > > Vignette are for one unit from one sample and for a few unit on > chromosome 2 > > for one sample. What is the function to retrieve all units on all > > chromosomes instead of units <- getUnitsOnChromosome(gi, chromosome=2, > > region=c(81,86)*1e6)? > > You can set 'units'

Hi. On Fri, Jul 26, 2013 at 8:02 AM, sean nj wrote: > Hi guys, > > I have a question regarding how to calculate raw copy numbers using common > reference instead of average of all samples of the study. Basically I want > to use average of HapMap270 samples as reference for all further copy number > calculations. > > I have a bunch HapMap270 snp6 cel files and I followed Vignette: Estimation > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step 5 - > Calculation of raw copy numbers, and generated ceR and saved it as a RData > file ceR.Rdata. It's important to understand that almost all objects in the Aroma framework are basically "pointers" to external files. For instance, your 'ceR', which I assume you've got from something like ceR <- getAverageFile(ces), is referring to the file with pathname getPathname(ceR). More below... > > My first question is, how to use this data for any future copy number > analysis? My guess is that instead of calculating the ceR from the sample > set I can just load the ceR.RData file I saved and use it. Right? First of all, please note that when do ceR <- getAverageFile(ces) on the same data set 'ces', the result is already available on file and it will be quickly found and returned. In other words, it will not recalcuate the averages again [unless you do ceR <- getAverageFile(ces, force=TRUE)]. However, I do understand that you may not want to have to keep a large 'ces' data set around, when you're only interested in the pooled average. In that case, I would copy the file containing the "average" to a new data set. Currently, this is not straightforward in Aroma (I'll think about something), but you can do the following: # Calculate the pooled average > ceR <- getAverageFile(cesN); # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. > filename <- getFilename(ceR); > filename [1] ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL" > rootPath <- getParent(getPath(cesN), depth=2L); > dataSet <- "HapMap270,pooled"; > chipType <- getChipType(ceR, fullname=FALSE); > path <- file.path(rootPath, dataSet, chipType); > path [1] "plmData/HapMap270,pooled/GenomeWideSNP_6" > mkdirs(path); > copyFile(getPathname(ceR), file.path(path, filename)); With this done, you can then grab this pooled reference as: > library("aroma.affymetrix") > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6"; > filename <- > ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"; > ceR <- CnChipEffectFile(filename, path=path); Note, when you save 'ceR', you are basically saving the reference to the file. Yes, you can load it later, but make sure not to move it, otherwise you'll get some type of "file not found" error. > saveObject(ceR, "HapMap270,GenomeWideSNP_6,reference.Rdata"); If already saved, and file not moved, you can then do: > library("aroma.affymetrix"); > ceR <- loadObject("HapMap270,GenomeWideSNP_6,reference.Rdata"); All this is very ad hoc (=non-aroma style), and as I said, I'll see if I can come up with a cleaner solution for storing and retrieving pooled averages. > > My second question is, how to go ahead from there to calculate the relative > copy numbers for all unit from all samples? The two examples given in the > Vignette are for one unit from one sample and for a few unit on chromosome 2 > for one sample. What is the function to retrieve all units on all > chromosomes instead of units <- getUnitsOnChromosome(gi, chromosome=2, > region=c(81,86)*1e6)? You can set 'units' to NULL to retrieve all loci, i.e. no need to use getUnitsOnChromosome(). FYI, units <- NULL will give the same data as with units <- 1:nbrOfUnits(gi). > And what is the function to retrieve all samples > instead of ce <- getFile(cesN, indexOf(cesN, "NA06985"))? Hmm... not clear what you mean. All samples are in 'cesN', and you do need to iterate over them somehow. Is this what you're looking for? for (ii in seq_along(cesN)) { ce <- getFile(cesN, ii) ... } Or are you asking how to extract the data from all samples? Then you can do: theta <- extractTheta(cesN, units=units) but be careful because that loads a lot of data into memory. Hope this helps, Henrik > > Thanks a lot for the help, > > Sean > > -- > -- > When reporting problems on aroma.affymetrix, make sure 1) to run the latest > version of the package, 2) to report the output of sessionInfo() and > traceback(), and 3) to post a complete code example. > > > You received this message because you are subscribed to the Google Groups > "aroma.affymetrix" group with website http://www.aroma-project.org/. > To post to this group, send email to aroma-affymetrix@googlegroups.com > To unsubscribe and other options, go to http://www.aroma-project.org/forum/ > > --- > You received this message because you are subscribed to the Google Groups > "aroma.affymetrix" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to aroma-affymetrix