Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples
Hi. On Tue, Aug 6, 2013 at 12:31 PM, ying chen njs...@gmail.com wrote: Hi Henrik, Sorry I have one more question to bug you. The urls in the df (by df - readDataFrame(db);) point to UCSCgenome browser with NCBI36/hg18 assembly. I looked at the http://www.aroma-project.org/chipTypes/GenomeWideSNP_6 page resource part, the SNP6 chip type files are listed as based on hg19 (Human Genome build v36.1). This is a typo, right? I thought it should be hg18. So the Cbs out put is also based on hg18, right? The genome version (hg18) in the URLs to the USCC Genome Browser are (still) hard coded. Please see: https://groups.google.com/d/msg/aroma-affymetrix/r8CtCvdGWnE/jzQFMifxt2MJ for more informations/details. /Henrik Thanks a lot for the help! Sean On Fri, Aug 2, 2013 at 11:03 AM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: On Thu, Aug 1, 2013 at 6:37 PM, ying chen njs...@gmail.com wrote: Hi Henrik, Thanks a lot for the help. For sm - CbsModel(dsTR, ref=dfR, tags=HapMapRef), you mean sm - CbsModel(dsT, ref=dfR, tags=HapMapRef), right? Correct, I meant: sm - CbsModel(dsT, ref=dfR, tags=HapMapRef) /H Thanks, Sean On Thu, Aug 1, 2013 at 6:00 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: On Mon, Jul 29, 2013 at 7:45 PM, ying chen njs...@gmail.com wrote: Hi Henrik, Thanks a lot for the help! Sorry I have more questions. I am following How to: Calculate total copy number ratios from total (non-polymorphic) signals and Vignette: Total copy-number segmentation (non-paired CBS), but I am not sure if I do it correctly. I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 as reference to go all the way to CBS step. So I do the following steps respectively. ds1 - doCRMAv2(HapMap270, chipType=GenomeWideSNP_6,Full) ds2 - doCRMAv2(Tumor, chipType=GenomeWideSNP_6,Full) After that, I do dataSet - HapMap270 tags - ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY chipType - GenomeWideSNP_6 dsN - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dataSet - Tumor tags - ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY chipType - GenomeWideSNP_6 dsT - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dfR - getAverageFile(dsN) # ? dsTR - exportTotalCnRatioSet(dsT, ref=dfR) # ? Would the above two steps work? My question is how to go ahead from here to do CBS and will sm - CbsModel(dsTR) work? Skip the exportTotalCnRatioSet() call, and instead use: sm - CbsModel(dsTR, ref=dfR, tags=HapMapRef); The 'tags' is just to add an informative tag to the output data set. /Henrik Thanks again for your help! Sean On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: Hi. On Fri, Jul 26, 2013 at 8:02 AM, sean nj njs...@gmail.com wrote: Hi guys, I have a question regarding how to calculate raw copy numbers using common reference instead of average of all samples of the study. Basically I want to use average of HapMap270 samples as reference for all further copy number calculations. I have a bunch HapMap270 snp6 cel files and I followed Vignette: Estimation of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step 5 - Calculation of raw copy numbers, and generated ceR and saved it as a RData file ceR.Rdata. It's important to understand that almost all objects in the Aroma framework are basically pointers to external files. For instance, your 'ceR', which I assume you've got from something like ceR - getAverageFile(ces), is referring to the file with pathname getPathname(ceR). More below... My first question is, how to use this data for any future copy number analysis? My guess is that instead of calculating the ceR from the sample set I can just load the ceR.RData file I saved and use it. Right? First of all, please note that when do ceR - getAverageFile(ces) on the same data set 'ces', the result is already available on file and it will be quickly found and returned. In other words, it will not recalcuate the averages again [unless you do ceR - getAverageFile(ces, force=TRUE)]. However, I do understand that you may not want to have to keep a large 'ces' data set around, when you're only interested in the pooled average. In that case, I would copy the file containing the average to a new data set. Currently, this is not straightforward in Aroma (I'll think about something), but you can do the following: # Calculate the pooled average ceR - getAverageFile(cesN); # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. filename - getFilename(ceR); filename [1]
Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples
Hi Henrik, I tried method I mentioned above. But I got an error message when running fit(sm, verbose=-10). Array #1 ('321T') of 291 on chromosome 1... Error in UseMethod(getChecksum) : no applicable method for 'getChecksum' applied to an object of class list What did I do wrong? Thanks a lot for the help! Sean library(aroma.affymetrix) dataSet - HapMap270 tags - ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY chipType - GenomeWideSNP_6 dsN - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dataSet - Tumor dsT - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dfR - getAverageFile(dsN) dsTR - exportTotalCnRatioSet(dsT, ref=dfR) There were 50 or more warnings (use warnings() to see the first 50) warnings() Warning messages: 1: In log(C) : NaNs produced 2: In log(C) : NaNs produced 3: In log(C) : NaNs produced ... print(dsTR) AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.72 MB RAM: 0.37MB print(dsT) AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.64 MB RAM: 0.37MB sm - CbsModel(dsTR) Loading required package: DNAcopy print(sm) CbsModel: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired Chip type (virtual): GenomeWideSNP_6 Path: cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6 Number of chip types: 1 Sample reference file pairs: Chip type #1 ('GenomeWideSNP_6') of 1: Sample data set: AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.72 MB RAM: 0.37MB Reference: median of samples RAM: 0.00MB fit(sm, verbose=-10) Building tuple of reference sets... Type of reference: median No reference available. Calculating average copy-number signals... Retrieving average unit signals across 291 arrays... AromaUnitTotalCnBinaryFile: Name: .average-signals-median-mad Tags: 8143f776d59b4c432debaf4bb288 Full name: .average-signals-median-mad,8143f776d59b4c432debaf4bb288 Pathname: rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f776d59b4c432debaf4bb288.asb File size: 7.18 MB (7526027 bytes) RAM: 0.00 MB Number of data rows: 1881415 File format: v1 Dimensions: 1881415x1 Column classes: double Number of bytes per column: 4 Footer: createdOn20130801 10:20:33 EDT/createdOnplatformAffymetrix/platformchipTypeGenomeWideSNP_6,Full/chipTypesrcDetailsnbrOfFiles291/nbrOfFilescheckSum60c565ccf6239a081e4162b3328e1ccb/checkSum/srcDetailsparamsmeanNamemedian/meanNamesdNamemad/sdName/params Platform: Affymetrix Chip type: GenomeWideSNP_6,Full Number of units to be updated: 1 Processing chunk... chr Indices in chunk: int 22933 Reading data... Reading data...done Estimating averages and standard deviations... Estimating averages and standard deviations...done Writing estimates... Writing estimates...done Processing chunk...done Retrieving average unit signals across 291 arrays...done Calculating average copy-number signals...done Building tuple of reference sets...done Using reference tuple: AromaUnitTotalCnBinarySetTuple: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Chip types: GenomeWideSNP_6 AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: .average-signals-median-mad, .average-signals-median-mad, .average-signals-median-mad, ..., .average-signals-median-mad [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.62 MB RAM: 0.37MB RAM: 0.00MB Extract DataFileMatrix... Array: 1 Test data sets: AromaUnitTotalCnBinarySetTuple: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Chip types: GenomeWideSNP_6 AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.72 MB RAM: 0.37MB RAM: 0.00MB Test data files: $`GenomeWideSNP_6,Full` AromaUnitTotalCnBinaryFile: Name: 321T
Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples
Sorry, one more question regarding CBS. During fit(sm, verbose=-10) Building tuple of reference sets... Type of reference: median No reference available. Calculating average copy-number signals... Retrieving average unit signals across 291 arrays... .. On Thu, Aug 1, 2013 at 10:53 AM, ying chen njs...@gmail.com wrote: Hi Henrik, I tried method I mentioned above. But I got an error message when running fit(sm, verbose=-10). Array #1 ('321T') of 291 on chromosome 1... Error in UseMethod(getChecksum) : no applicable method for 'getChecksum' applied to an object of class list What did I do wrong? Thanks a lot for the help! Sean library(aroma.affymetrix) dataSet - HapMap270 tags - ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY chipType - GenomeWideSNP_6 dsN - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dataSet - Tumor dsT - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dfR - getAverageFile(dsN) dsTR - exportTotalCnRatioSet(dsT, ref=dfR) There were 50 or more warnings (use warnings() to see the first 50) warnings() Warning messages: 1: In log(C) : NaNs produced 2: In log(C) : NaNs produced 3: In log(C) : NaNs produced ... print(dsTR) AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.72 MB RAM: 0.37MB print(dsT) AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.64 MB RAM: 0.37MB sm - CbsModel(dsTR) Loading required package: DNAcopy print(sm) CbsModel: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired Chip type (virtual): GenomeWideSNP_6 Path: cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6 Number of chip types: 1 Sample reference file pairs: Chip type #1 ('GenomeWideSNP_6') of 1: Sample data set: AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: 321T, 322T, 323T, ..., T97 [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.72 MB RAM: 0.37MB Reference: median of samples RAM: 0.00MB fit(sm, verbose=-10) Building tuple of reference sets... Type of reference: median No reference available. Calculating average copy-number signals... Retrieving average unit signals across 291 arrays... AromaUnitTotalCnBinaryFile: Name: .average-signals-median-mad Tags: 8143f776d59b4c432debaf4bb288 Full name: .average-signals-median-mad,8143f776d59b4c432debaf4bb288 Pathname: rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f776d59b4c432debaf4bb288.asb File size: 7.18 MB (7526027 bytes) RAM: 0.00 MB Number of data rows: 1881415 File format: v1 Dimensions: 1881415x1 Column classes: double Number of bytes per column: 4 Footer: createdOn20130801 10:20:33 EDT/createdOnplatformAffymetrix/platformchipTypeGenomeWideSNP_6,Full/chipTypesrcDetailsnbrOfFiles291/nbrOfFilescheckSum60c565ccf6239a081e4162b3328e1ccb/checkSum/srcDetailsparamsmeanNamemedian/meanNamesdNamemad/sdName/params Platform: Affymetrix Chip type: GenomeWideSNP_6,Full Number of units to be updated: 1 Processing chunk... chr Indices in chunk: int 22933 Reading data... Reading data...done Estimating averages and standard deviations... Estimating averages and standard deviations...done Writing estimates... Writing estimates...done Processing chunk...done Retrieving average unit signals across 291 arrays...done Calculating average copy-number signals...done Building tuple of reference sets...done Using reference tuple: AromaUnitTotalCnBinarySetTuple: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Chip types: GenomeWideSNP_6 AromaUnitTotalCnBinarySet: Name: Tumor Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY Number of files: 291 Names: .average-signals-median-mad, .average-signals-median-mad, .average-signals-median-mad, ..., .average-signals-median-mad [291] Path (to the first file): rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6 Total file size: 2088.62 MB RAM: 0.37MB RAM: 0.00MB Extract DataFileMatrix... Array: 1 Test data sets: AromaUnitTotalCnBinarySetTuple: Name: Tumor Tags:
Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples
On Mon, Jul 29, 2013 at 7:45 PM, ying chen njs...@gmail.com wrote: Hi Henrik, Thanks a lot for the help! Sorry I have more questions. I am following How to: Calculate total copy number ratios from total (non-polymorphic) signals and Vignette: Total copy-number segmentation (non-paired CBS), but I am not sure if I do it correctly. I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 as reference to go all the way to CBS step. So I do the following steps respectively. ds1 - doCRMAv2(HapMap270, chipType=GenomeWideSNP_6,Full) ds2 - doCRMAv2(Tumor, chipType=GenomeWideSNP_6,Full) After that, I do dataSet - HapMap270 tags - ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY chipType - GenomeWideSNP_6 dsN - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dataSet - Tumor tags - ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY chipType - GenomeWideSNP_6 dsT - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dfR - getAverageFile(dsN) # ? dsTR - exportTotalCnRatioSet(dsT, ref=dfR) # ? Would the above two steps work? My question is how to go ahead from here to do CBS and will sm - CbsModel(dsTR) work? Skip the exportTotalCnRatioSet() call, and instead use: sm - CbsModel(dsTR, ref=dfR, tags=HapMapRef); The 'tags' is just to add an informative tag to the output data set. /Henrik Thanks again for your help! Sean On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: Hi. On Fri, Jul 26, 2013 at 8:02 AM, sean nj njs...@gmail.com wrote: Hi guys, I have a question regarding how to calculate raw copy numbers using common reference instead of average of all samples of the study. Basically I want to use average of HapMap270 samples as reference for all further copy number calculations. I have a bunch HapMap270 snp6 cel files and I followed Vignette: Estimation of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step 5 - Calculation of raw copy numbers, and generated ceR and saved it as a RData file ceR.Rdata. It's important to understand that almost all objects in the Aroma framework are basically pointers to external files. For instance, your 'ceR', which I assume you've got from something like ceR - getAverageFile(ces), is referring to the file with pathname getPathname(ceR). More below... My first question is, how to use this data for any future copy number analysis? My guess is that instead of calculating the ceR from the sample set I can just load the ceR.RData file I saved and use it. Right? First of all, please note that when do ceR - getAverageFile(ces) on the same data set 'ces', the result is already available on file and it will be quickly found and returned. In other words, it will not recalcuate the averages again [unless you do ceR - getAverageFile(ces, force=TRUE)]. However, I do understand that you may not want to have to keep a large 'ces' data set around, when you're only interested in the pooled average. In that case, I would copy the file containing the average to a new data set. Currently, this is not straightforward in Aroma (I'll think about something), but you can do the following: # Calculate the pooled average ceR - getAverageFile(cesN); # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. filename - getFilename(ceR); filename [1] .average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL rootPath - getParent(getPath(cesN), depth=2L); dataSet - HapMap270,pooled; chipType - getChipType(ceR, fullname=FALSE); path - file.path(rootPath, dataSet, chipType); path [1] plmData/HapMap270,pooled/GenomeWideSNP_6 mkdirs(path); copyFile(getPathname(ceR), file.path(path, filename)); With this done, you can then grab this pooled reference as: library(aroma.affymetrix) path - plmData/HapMap270,pooled/GenomeWideSNP_6; filename - .average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL; ceR - CnChipEffectFile(filename, path=path); Note, when you save 'ceR', you are basically saving the reference to the file. Yes, you can load it later, but make sure not to move it, otherwise you'll get some type of file not found error. saveObject(ceR, HapMap270,GenomeWideSNP_6,reference.Rdata); If already saved, and file not moved, you can then do: library(aroma.affymetrix); ceR - loadObject(HapMap270,GenomeWideSNP_6,reference.Rdata); All this is very ad hoc (=non-aroma style), and as I said, I'll see if I can come up with a cleaner solution for storing and retrieving pooled averages. My second question is, how to go ahead from there to calculate the relative copy numbers for all unit from all samples? The two examples given in the Vignette are for one unit from one sample and for a few unit on chromosome 2 for one sample. What is the function to retrieve all units on all chromosomes instead of
Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples
Hi Henrik, Thanks a lot for the help! Sorry I have more questions. I am following How to: Calculate total copy number ratios from total (non-polymorphic) signals and Vignette: Total copy-number segmentation (non-paired CBS), but I am not sure if I do it correctly. I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270 as reference to go all the way to CBS step. So I do the following steps respectively. ds1 - doCRMAv2(HapMap270, chipType=GenomeWideSNP_6,Full) ds2 - doCRMAv2(Tumor, chipType=GenomeWideSNP_6,Full) After that, I do dataSet - HapMap270 tags - ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY chipType - GenomeWideSNP_6 dsN - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dataSet - Tumor tags - ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY chipType - GenomeWideSNP_6 dsT - AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType) dfR - getAverageFile(dsN) # ? dsTR - exportTotalCnRatioSet(dsT, ref=dfR) # ? Would the above two steps work? My question is how to go ahead from here to do CBS and will sm - CbsModel(dsTR) work? Thanks again for your help! Sean On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson henrik.bengts...@aroma-project.org wrote: Hi. On Fri, Jul 26, 2013 at 8:02 AM, sean nj njs...@gmail.com wrote: Hi guys, I have a question regarding how to calculate raw copy numbers using common reference instead of average of all samples of the study. Basically I want to use average of HapMap270 samples as reference for all further copy number calculations. I have a bunch HapMap270 snp6 cel files and I followed Vignette: Estimation of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step 5 - Calculation of raw copy numbers, and generated ceR and saved it as a RData file ceR.Rdata. It's important to understand that almost all objects in the Aroma framework are basically pointers to external files. For instance, your 'ceR', which I assume you've got from something like ceR - getAverageFile(ces), is referring to the file with pathname getPathname(ceR). More below... My first question is, how to use this data for any future copy number analysis? My guess is that instead of calculating the ceR from the sample set I can just load the ceR.RData file I saved and use it. Right? First of all, please note that when do ceR - getAverageFile(ces) on the same data set 'ces', the result is already available on file and it will be quickly found and returned. In other words, it will not recalcuate the averages again [unless you do ceR - getAverageFile(ces, force=TRUE)]. However, I do understand that you may not want to have to keep a large 'ces' data set around, when you're only interested in the pooled average. In that case, I would copy the file containing the average to a new data set. Currently, this is not straightforward in Aroma (I'll think about something), but you can do the following: # Calculate the pooled average ceR - getAverageFile(cesN); # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. filename - getFilename(ceR); filename [1] .average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL rootPath - getParent(getPath(cesN), depth=2L); dataSet - HapMap270,pooled; chipType - getChipType(ceR, fullname=FALSE); path - file.path(rootPath, dataSet, chipType); path [1] plmData/HapMap270,pooled/GenomeWideSNP_6 mkdirs(path); copyFile(getPathname(ceR), file.path(path, filename)); With this done, you can then grab this pooled reference as: library(aroma.affymetrix) path - plmData/HapMap270,pooled/GenomeWideSNP_6; filename - .average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL; ceR - CnChipEffectFile(filename, path=path); Note, when you save 'ceR', you are basically saving the reference to the file. Yes, you can load it later, but make sure not to move it, otherwise you'll get some type of file not found error. saveObject(ceR, HapMap270,GenomeWideSNP_6,reference.Rdata); If already saved, and file not moved, you can then do: library(aroma.affymetrix); ceR - loadObject(HapMap270,GenomeWideSNP_6,reference.Rdata); All this is very ad hoc (=non-aroma style), and as I said, I'll see if I can come up with a cleaner solution for storing and retrieving pooled averages. My second question is, how to go ahead from there to calculate the relative copy numbers for all unit from all samples? The two examples given in the Vignette are for one unit from one sample and for a few unit on chromosome 2 for one sample. What is the function to retrieve all units on all chromosomes instead of units - getUnitsOnChromosome(gi, chromosome=2, region=c(81,86)*1e6)? You can set 'units' to NULL to retrieve all loci, i.e. no need to use getUnitsOnChromosome(). FYI, units - NULL will give the same data as with units - 1:nbrOfUnits(gi). And what is the function to retrieve all samples
Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples
Hi. On Fri, Jul 26, 2013 at 8:02 AM, sean nj njs...@gmail.com wrote: Hi guys, I have a question regarding how to calculate raw copy numbers using common reference instead of average of all samples of the study. Basically I want to use average of HapMap270 samples as reference for all further copy number calculations. I have a bunch HapMap270 snp6 cel files and I followed Vignette: Estimation of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to Step 5 - Calculation of raw copy numbers, and generated ceR and saved it as a RData file ceR.Rdata. It's important to understand that almost all objects in the Aroma framework are basically pointers to external files. For instance, your 'ceR', which I assume you've got from something like ceR - getAverageFile(ces), is referring to the file with pathname getPathname(ceR). More below... My first question is, how to use this data for any future copy number analysis? My guess is that instead of calculating the ceR from the sample set I can just load the ceR.RData file I saved and use it. Right? First of all, please note that when do ceR - getAverageFile(ces) on the same data set 'ces', the result is already available on file and it will be quickly found and returned. In other words, it will not recalcuate the averages again [unless you do ceR - getAverageFile(ces, force=TRUE)]. However, I do understand that you may not want to have to keep a large 'ces' data set around, when you're only interested in the pooled average. In that case, I would copy the file containing the average to a new data set. Currently, this is not straightforward in Aroma (I'll think about something), but you can do the following: # Calculate the pooled average ceR - getAverageFile(cesN); # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g. filename - getFilename(ceR); filename [1] .average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL rootPath - getParent(getPath(cesN), depth=2L); dataSet - HapMap270,pooled; chipType - getChipType(ceR, fullname=FALSE); path - file.path(rootPath, dataSet, chipType); path [1] plmData/HapMap270,pooled/GenomeWideSNP_6 mkdirs(path); copyFile(getPathname(ceR), file.path(path, filename)); With this done, you can then grab this pooled reference as: library(aroma.affymetrix) path - plmData/HapMap270,pooled/GenomeWideSNP_6; filename - .average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL; ceR - CnChipEffectFile(filename, path=path); Note, when you save 'ceR', you are basically saving the reference to the file. Yes, you can load it later, but make sure not to move it, otherwise you'll get some type of file not found error. saveObject(ceR, HapMap270,GenomeWideSNP_6,reference.Rdata); If already saved, and file not moved, you can then do: library(aroma.affymetrix); ceR - loadObject(HapMap270,GenomeWideSNP_6,reference.Rdata); All this is very ad hoc (=non-aroma style), and as I said, I'll see if I can come up with a cleaner solution for storing and retrieving pooled averages. My second question is, how to go ahead from there to calculate the relative copy numbers for all unit from all samples? The two examples given in the Vignette are for one unit from one sample and for a few unit on chromosome 2 for one sample. What is the function to retrieve all units on all chromosomes instead of units - getUnitsOnChromosome(gi, chromosome=2, region=c(81,86)*1e6)? You can set 'units' to NULL to retrieve all loci, i.e. no need to use getUnitsOnChromosome(). FYI, units - NULL will give the same data as with units - 1:nbrOfUnits(gi). And what is the function to retrieve all samples instead of ce - getFile(cesN, indexOf(cesN, NA06985))? Hmm... not clear what you mean. All samples are in 'cesN', and you do need to iterate over them somehow. Is this what you're looking for? for (ii in seq_along(cesN)) { ce - getFile(cesN, ii) ... } Or are you asking how to extract the data from all samples? Then you can do: theta - extractTheta(cesN, units=units) but be careful because that loads a lot of data into memory. Hope this helps, Henrik Thanks a lot for the help, Sean -- -- When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example. You received this message because you are subscribed to the Google Groups aroma.affymetrix group with website http://www.aroma-project.org/. To post to this group, send email to aroma-affymetrix@googlegroups.com To unsubscribe and other options, go to http://www.aroma-project.org/forum/ --- You received this message because you are subscribed to the Google Groups aroma.affymetrix group. To unsubscribe from this group and stop receiving emails from it, send an email to aroma-affymetrix+unsubscr...@googlegroups.com. For more options, visit