Re: [aroma.affymetrix] Problem with extracting subset of samples for 500K

Henrik Bengtsson Mon, 19 Jul 2010 00:49:00 -0700

Hi,

answer below.


On Fri, Jul 16, 2010 at 4:48 AM, biobee <manishab...@gmail.com> wrote:
> I am doing an upaired copy number analysis with 500K arrays using the
> CRMAv2. I want to use a subset of my samples the reference, instead of
> the all the samples as robust average for CBS segmentation.So I want
> to run CBS as CbsModel(ref,tumor). I am stuck at the point where I
> cannot extract the subset of samples for reference properly.Since the
> CnChipEffectSet is a list of two enzyme set, how do I extract out the
> subset of samples for both enzymes
>
> I have also listed the error I get below:
>
>  Here is the code I run:
>
> library(aroma.affymetrix)
>
> log <- Arguments$getVerbose(-4, timestamp=TRUE)
>
> dataSetName<- "Test"
> chipTypes <- c("Mapping250K_Nsp","Mapping250K_Sty")
>
> cdfs <- lapply(chipTypes, FUN=function(chipType) {
>  AffymetrixCdfFile$byChipType(chipType)
> })
> print(cdfs)
>
> gis <- lapply(cdfs, getGenomeInformation)
> print(gis)
>
> sis <- lapply(cdfs, getSnpInformation)
> print(sis)
>
> acs <- lapply(lapply(cdfs, getChipType), FUN=function(chipType) {
> AromaCellSequenceFile$byChipType(chipType)})
> print(acs)
>
> cesNList <- list()
> chipType <- chipTypes[1]
> cs <- AffymetrixCelSet$byName(dataSetName,chipType=chipType)
> cs <- extract(cs,!isDuplicated(cs))
> print(cs)
>
> acc <- AllelicCrosstalkCalibration(cs,model="CRMAv2");
>  print(acc);
>  csAcc <- process(acc, verbose=log);
>  bpn <- BasePositionNormalization(csAcc,target="zero")
>  print(bpn)
>  csN <- process(bpn,verbose=log)
> plm <- RmaCnPlm(csN,mergeStrands=TRUE, combineAlleles=TRUE, shift=
> +300);
>  print(plm);
>  fit(plm, verbose=log);
>
>  if (length(findUnitsTodo(plm)) > 0) {
>  units <- fitCnProbes(plm, verbose=verbose)
>  str(units)
>   units <- fit(plm, verbose=verbose)
>  str(units)
>   }
>   ces <- getChipEffectSet(plm)
>   print(ces)
>
>   fln <- FragmentLengthNormalization(ces, target="zero")
>   print(fln)
>
> cesNList[[chipType]] <- process(fln, verbose=verbose)
> ############################################################################
>   # This is for the second chipType
>    chipType <- chipTypes[2]
>    cs <- AffymetrixCelSet$byName(dataSetName,chipType=chipType)
>    cs <- extract(cs,!isDuplicated(cs))
>    print(cs)
>
>   acc <- AllelicCrosstalkCalibration(cs,model="CRMAv2");
>   print(acc);
>   csAcc <- process(acc, verbose=log);
>
>  bpn <- BasePositionNormalization(csAcc,target="zero")
>  print(bpn)
>  csN <- process(bpn,verbose=log);
>
>  plm <- RmaCnPlm(csN,mergeStrands=TRUE, combineAlleles=TRUE, shift=
> +300);
>  print(plm);
>  fit(plm, verbose=log);
> if (length(findUnitsTodo(plm)) > 0) {
>  units <- fitCnProbes(plm, verbose=verbose)
>  str(units)
>  units <- fit(plm, verbose=verbose)
>  str(units)
>   }
>   ces <- getChipEffectSet(plm)
>   print(ces);
>
>   ces <- getChipEffectSet(plm)
>   print(ces)
>
>   fln <- FragmentLengthNormalization(ces, target="zero")
>   print(fln)
>
>   cesNList[[chipType]] <- process(fln, verbose=verbose)
> ============================================================
> cesNList is my CnChipEffectSet
> Now my CnChipEffectSet: looks like this
>
> $Mapping250K_Nsp
> CnChipEffectSet:
> Name: Test
> Tags: ACC,-XY,BPN,-XY,RMA,+300,A+B,FLN,-XY
> Path: plmData/Test,ACC,-XY,BPN,-XY,RMA,+300,A+B,FLN,-XY/
> Mapping250K_Nsp
> Platform: Affymetrix
> Chip type: Mapping250K_Nsp,monocell
> Number of arrays: 80
> Names: AA-HGG024, AA-HGG045, ..., OT-HGG155
> Time period: 2010-07-14 15:46:23 -- 2010-07-14 15:46:40
> Total file size: 764.52MB
> RAM: 0.14MB
> Parameters: (probeModel: chr "pm", mergeStrands: logi TRUE,
> combineAlleles: logi TRUE)
>
> $Mapping250K_Sty
> CnChipEffectSet:
> Name: Test
> Tags: ACC,-XY,BPN,-XY,RMA,+300,A+B,FLN,-XY
> Path: plmData/Test,ACC,-XY,BPN,-XY,RMA,+300,A+B,FLN,-XY/
> Mapping250K_Sty
> Platform: Affymetrix
> Chip type: Mapping250K_Sty,monocell
> Number of arrays: 79
> Names: AA-HGG024, AA-HGG045, ..., OT-HGG155
> Time period: 2010-07-15 17:19:56 -- 2010-07-15 17:20:13
> Total file size: 687.18MB
> RAM: 0.14MB
> Parameters: (probeModel: chr "pm", mergeStrands: logi TRUE,
> combineAlleles: logi TRUE)
> =================================================================
> To extract the subset of arrays from cesNList
>
> RefCes <- extract(cesNList, 49:77)
>
> Error in list(`extract(cesNList, 49:77)` = <environment>,
> `extract.default(cesNList, 49:77)` = <environment>,  :
>
> [2010-07-15 22:05:21] Exception: Do not know how to unwrap object:
> list
>  at throw(Exception(...))
>  at throw.default("Do not know how to unwrap object: ", class(x)[1])
>  at throw("Do not know how to unwrap object: ", class(x)[1])
>  at extract.default(cesNList, 49:77)
>  at extract(cesNList, 49:77)
> ==============================================================
> I also tried to first extract the array names I want to use as
> reference set and I did this:
> for (chiptype in names(cesNList)) {
>         ces      <- cesNList[[chiptype]];
>         cesnames <- getNames(ces); }
>
> refcols<- cesnames[49:77]
> extract(cesNList,refcols)
>  extract(cesNList,refcols)
> Error in list(`extract(cesNList, refcols)` = <environment>,
> `extract.default(cesNList, refcols)` = <environment>,  :
>
> [2010-07-15 22:34:24] Exception: Do not know how to unwrap object:
> list
>  at throw(Exception(...))
>  at throw.default("Do not know how to unwrap object: ", class(x)[1])
>  at throw("Do not know how to unwrap object: ", class(x)[1])
>  at extract.default(cesNList, refcols)
>  at extract(cesNList, refcols)
>

You need to apply extract() to each list element separately (as you
would have done if there would only be one chip type/enzyme involved):

cesRList <- lapply(cesNList, extract, 49:77);

or more explictly:

cesRList <- lapply(cesNList, FUN=function(ces) {
  extract(ces, 49:77);
});

This will return a list of two element, where each element is a
CnChipEffectSet containing (77-49+1)=29 arrays.


> Since the CnChipEffectSet is a list of two enzyme set, how do I
> extract out the subset of samples for both enzymes and do the
> downstream analysis such as:
>
> ceR <- getAverageFile(RefCes, verbose=verbose)
> cesT <- extract(cesNList,1:48)
> cbs <- CbsModel(ceR, cesT)

If think you flipped a things around here, but I assume you wish to
use the pool of all reference samples as the reference for calculating
copy numbers of your test/tumor samples.  Then, do this:

ceRList <- lapply(cesRList, FUN=getAverageFile);

This will you a list of two CnChipEffectFile:s, each the robust
average of your 29 reference samples.  Note that you could have done
the above two steps in one go as:

ceRList <- lapply(cesNList, FUN=function(ces) {
  cesR <- extract(ces, 49:77);
  getAverageFile(cesR);
});

Now you have your reference signals.  Next, pull out your test/tumor signals:

cesTList <- lapply(cesNList, FUN=function(ces) {
  extract(ces, 1:48);
});

which gives a list of two CnChipEffectSet:s each containing 48
samples.   To segment these 48 samples using the above (pooled)
reference for calculating the CN ratios, do:

cbs <- CbsModel(cesTList, ceRList)

That's all.

Let me know if it works for you.

/Henrik

>
> ==============================================================
> attached base packages:
> [1] stats     graphics  grDevices datasets  utils     methods
> base
>
> other attached packages:
>  [1] DNAcopy_1.18.0         preprocessCore_1.6.0
> aroma.affymetrix_1.5.0
>  [4] aroma.apd_0.1.7        affxparser_1.16.0
> R.huge_0.2.0
>  [7] aroma.core_1.5.0       aroma.light_1.16.0
> matrixStats_0.2.1
> [10] R.rsp_0.3.6            R.cache_0.3.0
> R.filesets_0.8.1
> [13] digest_0.4.2           R.utils_1.4.0
> R.oo_1.7.2
> [16] R.methodsS3_1.2.0
>
> Thanks
> Manisha
>
>
> --
> When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
> version of the package, 2) to report the output of sessionInfo() and 
> traceback(), and 3) to post a complete code example.
>
>
> You received this message because you are subscribed to the Google Groups 
> "aroma.affymetrix" group with website http://www.aroma-project.org/.
> To post to this group, send email to aroma-affymetrix@googlegroups.com
> To unsubscribe and other options, go to http://www.aroma-project.org/forum/
>

-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

Re: [aroma.affymetrix] Problem with extracting subset of samples for 500K

Reply via email to