Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

ying chen Thu, 01 Aug 2013 07:57:33 -0700

Sorry, one more question regarding CBS. During

fit(sm, verbose=-10)
Building tuple of reference sets...
 Type of reference: median
 No reference available.
 Calculating average copy-number signals...
  Retrieving average unit signals across 291 arrays...
......



On Thu, Aug 1, 2013 at 10:53 AM, ying chen <njs...@gmail.com> wrote:

> Hi Henrik,
>
> I tried method I mentioned above. But I got an error message when running
> fit(sm, verbose=-10).
>
>
>
>  Array #1 ('321T') of 291 on chromosome 1...
>  Error in UseMethod("getChecksum") :
>    no applicable method for 'getChecksum' applied to an object of class
> "list"
>
> What did I do wrong?
>
> Thanks a lot for the help!
>
> Sean
>
>
> > library(aroma.affymetrix)
>
> > dataSet <- "HapMap270"
> > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
> > chipType <- "GenomeWideSNP_6"
> > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> chipType=chipType)
> > dataSet <- "Tumor"
> > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
> chipType=chipType)
> > dfR <- getAverageFile(dsN)
> > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR)
> There were 50 or more warnings (use warnings() to see the first 50)
> > warnings()
> Warning messages:
> 1: In log(C) : NaNs produced
> 2: In log(C) : NaNs produced
> 3: In log(C) : NaNs produced
> ...
> > print(dsTR)
> AromaUnitTotalCnBinarySet:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Number of files: 291
> Names: 321T, 322T, 323T, ..., T97 [291]
> Path (to the first file):
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
> Total file size: 2088.72 MB
> RAM: 0.37MB
> > print(dsT)
> AromaUnitTotalCnBinarySet:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Number of files: 291
> Names: 321T, 322T, 323T, ..., T97 [291]
> Path (to the first file):
> totalAndFracBData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
> Total file size: 2088.64 MB
> RAM: 0.37MB
> >
> > sm <- CbsModel(dsTR)
> >
> Loading required package: DNAcopy
> > print(sm)
> CbsModel:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired
> Chip type (virtual): GenomeWideSNP_6
> Path:
> cbsData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,paired/GenomeWideSNP_6
> Number of chip types: 1
> Sample & reference file pairs:
> Chip type #1 ('GenomeWideSNP_6') of 1:
> Sample data set:
> AromaUnitTotalCnBinarySet:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Number of files: 291
> Names: 321T, 322T, 323T, ..., T97 [291]
> Path (to the first file):
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
> Total file size: 2088.72 MB
> RAM: 0.37MB
> Reference: <median of samples>
> RAM: 0.00MB
> >
> > fit(sm, verbose=-10)
> Building tuple of reference sets...
>  Type of reference: median
>  No reference available.
>  Calculating average copy-number signals...
>   Retrieving average unit signals across 291 arrays...
>    AromaUnitTotalCnBinaryFile:
>    Name: .average-signals-median-mad
>    Tags: 8143f7733336d59b4c432debaf4bb288
>    Full name: .average-signals-median-mad,8143f7733336d59b4c432debaf4bb288
>    Pathname:
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/.average-signals-median-mad,8143f7733336d59b4c432debaf4bb288.asb
>    File size: 7.18 MB (7526027 bytes)
>    RAM: 0.00 MB
>    Number of data rows: 1881415
>    File format: v1
>    Dimensions: 1881415x1
>    Column classes: double
>    Number of bytes per column: 4
>    Footer: <createdOn>20130801 10:20:33
> EDT</createdOn><platform>Affymetrix</platform><chipType>GenomeWideSNP_6,Full</chipType><srcDetails><nbrOfFiles>291</nbrOfFiles><checkSum>60c565ccf6239a081e4162b3328e1ccb</checkSum></srcDetails><params><meanName>median</meanName><sdName>mad</sdName></params>
>    Platform: Affymetrix
>    Chip type: GenomeWideSNP_6,Full
>    Number of units to be updated: 1
>    Processing chunk...
>      chr "Indices in chunk:"
>      int 22933
>     Reading data...
>     Reading data...done
>     Estimating averages and standard deviations...
>     Estimating averages and standard deviations...done
>     Writing estimates...
>     Writing estimates...done
>    Processing chunk...done
>   Retrieving average unit signals across 291 arrays...done
>  Calculating average copy-number signals...done
> Building tuple of reference sets...done
> Using reference tuple:
> AromaUnitTotalCnBinarySetTuple:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Chip types: GenomeWideSNP_6
> AromaUnitTotalCnBinarySet:
> Name: Tumor
> Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
> Number of files: 291
> Names: .average-signals-median-mad, .average-signals-median-mad,
> .average-signals-median-mad, ..., .average-signals-median-mad [291]
> Path (to the first file):
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
> Total file size: 2088.62 MB
> RAM: 0.37MB
> RAM: 0.00MB
> Extract DataFileMatrix...
>  Array: 1
>  Test data sets:
>  AromaUnitTotalCnBinarySetTuple:
>  Name: Tumor
>  Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>  Chip types: GenomeWideSNP_6
>  AromaUnitTotalCnBinarySet:
>  Name: Tumor
>  Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>  Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>  Number of files: 291
>  Names: 321T, 322T, 323T, ..., T97 [291]
>  Path (to the first file):
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
>  Total file size: 2088.72 MB
>  RAM: 0.37MB
>  RAM: 0.00MB
>  Test data files:
>  $`GenomeWideSNP_6,Full`
>  AromaUnitTotalCnBinaryFile:
>  Name: 321T
>  Tags:
> ref=.average-signals-median-mad,899bb16b33895d8872112e64f5eb74b7,log2ratio,total
>  Full name:
> 321T,ref=.average-signals-median-mad,899bb16b33895d8872112e64f5eb74b7,log2ratio,total
>  Pathname:
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6/321T,ref=.average-signals-median-mad,899bb16b33895d8872112e64f5eb74b7,log2ratio,total.asb
>  File size: 7.18 MB (7526390 bytes)
>  RAM: 0.00 MB
>  Number of data rows: 1881415
>  File format: v1
>  Dimensions: 1881415x1
>  Column classes: double
>  Number of bytes per column: 4
>  Footer: <createdOn>20130801 10:08:04
> EDT</createdOn><platform>Affymetrix</platform><chipType>GenomeWideSNP_6,Full</chipType><srcFiles><srcFile><dataSet>Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY</dataSet><fullName>321T,total</fullName><filename>321T,total.asb</filename><checksum>d588ec42546855b2a1fd6e68d7181af5</checksum></srcFile><refFile><dataSet>Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY</dataSet><fullName>.average-signals-median-mad,899bb16b33895d8872112e64f5eb74b7</fullName><filename>.average-signals-median-mad,899bb16b33895d8872112e64f5eb74b7.asb</filename><checksum>8a54b83f1c8856778e47648db09bfda9</checksum></refFile></srcFiles>
>  Platform: Affymetrix
>  Chip type: GenomeWideSNP_6,Full
>
>  attr(,"class")
>  [1] "AromaUnitTotalCnBinaryFileList" "GenericDataFileList"
>  [3] "list"
>  Type of reference: median
>  Reference data sets:
>  AromaUnitTotalCnBinarySetTuple:
>  Name: Tumor
>  Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>  Chip types: GenomeWideSNP_6
>  AromaUnitTotalCnBinarySet:
>  Name: Tumor
>  Tags: ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>  Full name: Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY
>  Number of files: 291
>  Names: .average-signals-median-mad, .average-signals-median-mad,
> .average-signals-median-mad, ..., .average-signals-median-mad [291]
>  Path (to the first file):
> rawCnData/Tumor,ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY/GenomeWideSNP_6
>  Total file size: 2088.62 MB
>  RAM: 0.37MB
>  RAM: 0.00MB
> Extract DataFileMatrix...done
> Genomic-signal tags:
> ref=.average-signals-median-mad,899bb16b33895d8872112e64f5eb74b7,log2ratio,total
> Reference tags: 688d239f3db29d92d513e604879b915d
> Array #1 ('321T') of 291 on chromosome 1...
> Error in UseMethod("getChecksum") :
>   no applicable method for 'getChecksum' applied to an object of class
> "list"
> Array #1 ('321T') of 291 on chromosome 1...done
> >
> >
> > sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US       LC_NUMERIC=C         LC_TIME=en_US
>  [4] LC_COLLATE=en_US     LC_MONETARY=en_US    LC_MESSAGES=en_US
>  [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C
> [10] LC_TELEPHONE=C       LC_MEASUREMENT=en_US LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>  [1] DNAcopy_1.34.0         aroma.affymetrix_2.9.0 affxparser_1.32.1
>  [4] aroma.apd_0.2.3        R.huge_0.4.1           aroma.light_1.30.2
>  [7] aroma.core_2.9.0       matrixStats_0.8.1      R.rsp_0.8.2
> [10] R.devices_2.2.2        R.filesets_2.0.1       R.utils_1.23.2
> [13] R.oo_1.13.0            R.methodsS3_1.4.4
>
> loaded via a namespace (and not attached):
> [1] digest_0.6.3  PSCBS_0.34.8  R.cache_0.6.5
> >
> > traceback()
> No traceback available
> >
>
>
>
>
>
> On Mon, Jul 29, 2013 at 10:45 PM, ying chen <njs...@gmail.com> wrote:
>
>> Hi Henrik,
>> Thanks a lot for the help!
>> Sorry I have more questions. I am following "How to: Calculate total copy
>> number ratios from total (non-polymorphic) signals" and "Vignette: Total
>> copy-number segmentation (non-paired CBS)", but I am not sure if I do it
>> correctly.
>> I have two SNP6 datasets Tumor and HapMap270 and I want to use HpaMap270
>> as reference to go all the way to CBS step. So I do the following steps
>> respectively.
>>   > ds1 <- doCRMAv2("HapMap270", chipType="GenomeWideSNP_6,Full")
>>   > ds2 <- doCRMAv2("Tumor", chipType="GenomeWideSNP_6,Full")
>> After that, I do
>>  > dataSet <- "HapMap270"
>>  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>>  > chipType <- "GenomeWideSNP_6"
>>  > dsN <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
>> chipType=chipType)
>>  > dataSet <- "Tumor"
>>  > tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>>  > chipType <- "GenomeWideSNP_6"
>>  > dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags,
>> chipType=chipType)
>>  > dfR <- getAverageFile(dsN)   # ?
>>  > dsTR <- exportTotalCnRatioSet(dsT, ref=dfR) # ?
>> Would the above two steps work? My question is how to go ahead from here
>> to do CBS and will sm <- CbsModel(dsTR) work?
>> Thanks again for your help!
>> Sean
>>
>>
>>
>>
>> On Sun, Jul 28, 2013 at 4:28 PM, Henrik Bengtsson <
>> henrik.bengts...@aroma-project.org> wrote:
>>
>>> Hi.
>>>
>>> On Fri, Jul 26, 2013 at 8:02 AM, sean nj <njs...@gmail.com> wrote:
>>> > Hi guys,
>>> >
>>> > I have a question regarding how to calculate raw copy numbers using
>>> common
>>> > reference instead of average of all samples of the study. Basically I
>>> want
>>> > to use average of HapMap270 samples as reference for all further copy
>>> number
>>> > calculations.
>>> >
>>> > I have a bunch HapMap270 snp6 cel files and I followed Vignette:
>>> Estimation
>>> > of total copy numbers using the CRMA v2 method (10K-CytoScanHD) to
>>> Step 5 -
>>> > Calculation of raw copy numbers, and generated ceR and saved it as a
>>> RData
>>> > file ceR.Rdata.
>>>
>>> It's important to understand that almost all objects in the Aroma
>>> framework are basically "pointers" to external files.  For instance,
>>> your 'ceR', which I assume you've got from something like ceR <-
>>> getAverageFile(ces), is referring to the file with pathname
>>> getPathname(ceR).  More below...
>>>
>>> >
>>> > My first question is, how to use this data for any future copy number
>>> > analysis? My guess is that instead of calculating the ceR from the
>>> sample
>>> > set I can just load the ceR.RData file I saved and use it. Right?
>>>
>>> First of all, please note that when do ceR <- getAverageFile(ces) on
>>> the same data set 'ces', the result is already available on file and
>>> it will be quickly found and returned.  In other words, it will not
>>> recalcuate the averages again [unless you do ceR <-
>>> getAverageFile(ces, force=TRUE)].
>>>
>>> However, I do understand that you may not want to have to keep a large
>>> 'ces' data set around, when you're only interested in the pooled
>>> average.  In that case, I would copy the file containing the "average"
>>> to a new data set.  Currently, this is not straightforward in Aroma
>>> (I'll think about something), but you can do the following:
>>>
>>> # Calculate the pooled average
>>> > ceR <- getAverageFile(cesN);
>>>
>>> # Copy this file to plmData/HapMap270,pooled/GenomeWideSNP_6/, e.g.
>>> > filename <- getFilename(ceR);
>>> > filename
>>> [1]
>>> ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL"
>>> > rootPath <- getParent(getPath(cesN), depth=2L);
>>> > dataSet <- "HapMap270,pooled";
>>> > chipType <- getChipType(ceR, fullname=FALSE);
>>> > path <- file.path(rootPath, dataSet, chipType);
>>> > path
>>> [1] "plmData/HapMap270,pooled/GenomeWideSNP_6"
>>> > mkdirs(path);
>>> > copyFile(getPathname(ceR), file.path(path, filename));
>>>
>>> With this done, you can then grab this pooled reference as:
>>>
>>> > library("aroma.affymetrix")
>>> > path <- "plmData/HapMap270,pooled/GenomeWideSNP_6";
>>> > filename <-
>>> ".average-intensities-median-mad,d03faaf8b707a97c4e43381b1a5d1ef2.CEL";
>>> > ceR <- CnChipEffectFile(filename, path=path);
>>>
>>> Note, when you save 'ceR', you are basically saving the reference to
>>> the file.  Yes, you can load it later, but make sure not to move it,
>>> otherwise you'll get some type of "file not found" error.
>>>
>>> > saveObject(ceR, "HapMap270,GenomeWideSNP_6,reference.Rdata");
>>>
>>> If already saved, and file not moved, you can then do:
>>>
>>> > library("aroma.affymetrix");
>>> > ceR <- loadObject("HapMap270,GenomeWideSNP_6,reference.Rdata");
>>>
>>> All this is very ad hoc (=non-aroma style), and as I said, I'll see if
>>> I can come up with a cleaner solution for storing and retrieving
>>> pooled averages.
>>>
>>> >
>>> > My second question is, how to go ahead from there to calculate the
>>> relative
>>> > copy numbers for all unit from all samples? The two examples given in
>>> the
>>> > Vignette  are for one unit from one sample and for a few unit on
>>> chromosome 2
>>> > for one sample. What is the function to retrieve all units on all
>>> > chromosomes instead of units <- getUnitsOnChromosome(gi, chromosome=2,
>>> > region=c(81,86)*1e6)?
>>>
>>> You can set 'units' to NULL to retrieve all loci, i.e. no need to use
>>> getUnitsOnChromosome().  FYI, units <- NULL will give the same data as
>>> with units <- 1:nbrOfUnits(gi).
>>>
>>> > And what is the function to retrieve all samples
>>> > instead of ce <- getFile(cesN, indexOf(cesN, "NA06985"))?
>>>
>>> Hmm... not clear what you mean.  All samples are in 'cesN', and you do
>>> need to iterate over them somehow.  Is this what you're looking for?
>>>
>>> for (ii in seq_along(cesN)) {
>>>   ce <- getFile(cesN, ii)
>>>   ...
>>> }
>>>
>>> Or are you asking how to extract the data from all samples?  Then you
>>> can do:
>>>
>>> theta <- extractTheta(cesN, units=units)
>>>
>>> but be careful because that loads a lot of data into memory.
>>>
>>> Hope this helps,
>>>
>>> Henrik
>>>
>>> >
>>> > Thanks a lot for the help,
>>> >
>>> > Sean
>>> >
>>> > --
>>> > --
>>> > When reporting problems on aroma.affymetrix, make sure 1) to run the
>>> latest
>>> > version of the package, 2) to report the output of sessionInfo() and
>>> > traceback(), and 3) to post a complete code example.
>>> >
>>> >
>>> > You received this message because you are subscribed to the Google
>>> Groups
>>> > "aroma.affymetrix" group with website http://www.aroma-project.org/.
>>> > To post to this group, send email to aroma-affymetrix@googlegroups.com
>>> > To unsubscribe and other options, go to
>>> http://www.aroma-project.org/forum/
>>> >
>>> > ---
>>> > You received this message because you are subscribed to the Google
>>> Groups
>>> > "aroma.affymetrix" group.
>>> > To unsubscribe from this group and stop receiving emails from it, send
>>> an
>>> > email to aroma-affymetrix+unsubscr...@googlegroups.com.
>>> > For more options, visit https://groups.google.com/groups/opt_out.
>>> >
>>> >
>>>
>>> --
>>> --
>>> When reporting problems on aroma.affymetrix, make sure 1) to run the
>>> latest version of the package, 2) to report the output of sessionInfo() and
>>> traceback(), and 3) to post a complete code example.
>>>
>>>
>>> You received this message because you are subscribed to the Google
>>> Groups "aroma.affymetrix" group with website
>>> http://www.aroma-project.org/.
>>> To post to this group, send email to aroma-affymetrix@googlegroups.com
>>> To unsubscribe and other options, go to
>>> http://www.aroma-project.org/forum/
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "aroma.affymetrix" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to aroma-affymetrix+unsubscr...@googlegroups.com.
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>>
>>>
>>
>

-- 
-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

--- 
You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to aroma-affymetrix+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Re: [aroma.affymetrix] Raw copy numbers calculation using common reference instead of average of all samples

Reply via email to