Hi Vonn.

On Fri, Sep 3, 2010 at 10:47 AM, Vonn <vwal...@email.unc.edu> wrote:
> Hi All,
>
> I'm interested in doing some basic quality control with Affy SNP6.0
> chips.  In particular, I'd like to get summary statistics and plot
> densities of the intensity values for each array both before and after
> normalization (allelic crosstalk calibration and base position
> normalization).  Plotting the densities pre/post-normalization is no
> problem.  Moreover, I can use extractAffyBatch() to get the intensity
> values and summary statistics before normalizing.  However, I get an
> error about a corrupted CEL file when I try to use extractAffyBatch()
> post-normalization.
>
> Is there a way to extract the intensity values without using
> extractAffyBatch()?

Actually, it does not make sense to try to extract an AffyBatch from a
ChipEffectSet.  The reason is that an AffyBatch should hold
probe-level signals, while a ChipEffectSet contains summarized
signals.  In the next release of aroma.affymetrix, extractAffyBatch()
will throw a more informative error message explaining this.

I'd like to ask you what do you want to do in the next step, because
depending on what you want to do, there are different ways of
extract/working with/exporting the summarized signals.

For instance, you may want to use the extractDataFrame() method, cf.

  http://www.aroma-project.org/howtos/extractDataFrame

Notes: This will load all requested data into memory, so you probably
want to subset by units and/or samples when using that method.  Also,
the first time you use this method for a new chip type, it will take
some time, because it first has to do some formatting of the
annotation data.  Just be patient; you can use verbose=-50 to see
what's going on.

See also extractTheta(), cf.

  http://www.aroma-project.org/howtos/extractTheta

If you want to export your data to tab-delimited files, there is also
writeDataFrame(), cf.

  http:/www.aroma-project.org/howtos/writeDataFrame

There are more options, but I leave it there until I know what you are
really after.

Hope this helps

Henrik

>
> My code, traceback, and sessionInfo are below.
>
> Thanks in advance,
>
> Vonn
>
>> #Load aroma
>> library(aroma.affymetrix)
> Loading required package: R.utils
> Loading required package: R.oo
> Loading required package: R.methodsS3
> R.methodsS3 v1.2.0 (2010-03-13) successfully loaded. See ?R.methodsS3
> for help.
> R.oo v1.7.3 (2010-06-04) successfully loaded. See ?R.oo for help.
> R.utils v1.5.0 (2010-08-04) successfully loaded. See ?R.utils for
> help.
> Loading required package: R.filesets
> Loading required package: digest
> R.filesets v0.8.3 (2010-07-06) successfully loaded. See ?R.filesets
> for help.
> Loading required package: aroma.core
> Loading required package: R.cache
> R.cache v0.3.0 (2010-03-13) successfully loaded. See ?R.cache for
> help.
> Loading required package: R.rsp
> R.rsp v0.3.6 (2009-09-16) successfully loaded. See ?R.rsp for help.
>  Type browseRsp() to open the RSP main menu in your browser.
> Loading required package: matrixStats
> matrixStats v0.2.1 (2010-04-05) successfully loaded. See ?matrixStats
> for help.
> Loading required package: aroma.light
> aroma.light v1.16.1 (2010-06-23) successfully loaded. See ?aroma.light
> for help.
> aroma.core v1.7.0 (2010-07-26) successfully loaded. See ?aroma.core
> for help.
> Loading required package: aroma.apd
> Loading required package: R.huge
> R.huge v0.2.0 (2009-10-16) successfully loaded. See ?R.huge for help.
> Loading required package: affxparser
> aroma.apd v0.1.7 (2009-10-16) successfully loaded. See ?aroma.apd for
> help.
> aroma.affymetrix v1.7.0 (2010-07-26) successfully loaded. See ?
> aroma.affymetrix for help.
>> log = verbose = Arguments$getVerbose(-8, timestamp = TRUE)
>> options(digits = 4)
>>
>> #Test to make sure things are working
>> cdf = AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags = "Full")
>> print(cdf)
> AffymetrixCdfFile:
> Path: annotationData/chipTypes/GenomeWideSNP_6
> Filename: GenomeWideSNP_6,Full.cdf
> Filesize: 470.44MB
> Chip type: GenomeWideSNP_6,Full
> RAM: 0.00MB
> File format: v4 (binary; XDA)
> Dimension: 2572x2680
> Number of cells: 6892960
> Number of units: 1881415
> Cells per unit: 3.66
> Number of QC units: 4
>>
>> gi = getGenomeInformation(cdf)
>> print(gi)
> UgpGenomeInformation:
> Name: GenomeWideSNP_6
> Tags: Full,na30,hg18,HB20100215
> Full name: GenomeWideSNP_6,Full,na30,hg18,HB20100215
> Pathname: annotationData/chipTypes/GenomeWideSNP_6/
> GenomeWideSNP_6,Full,na30,hg18,HB20100215.ugp
> File size: 8.97 MB (9407867 bytes)
> RAM: 0.00 MB
> Chip type: GenomeWideSNP_6,Full
>>
>> si = getSnpInformation(cdf)
>> print(si)
> UflSnpInformation:
> Name: GenomeWideSNP_6
> Tags: Full,na30,hg18,HB20100215
> Full name: GenomeWideSNP_6,Full,na30,hg18,HB20100215
> Pathname: annotationData/chipTypes/GenomeWideSNP_6/
> GenomeWideSNP_6,Full,na30,hg18,HB20100215.ufl
> File size: 7.18 MB (7526452 bytes)
> RAM: 0.00 MB
> Chip type: GenomeWideSNP_6,Full
> Number of enzymes: 2
>>
>> acs = AromaCellSequenceFile$byChipType(getChipType(cdf, fullname = FALSE))
>> print(acs)
> AromaCellSequenceFile:
> Name: GenomeWideSNP_6
> Tags: HB20080710
> Full name: GenomeWideSNP_6,HB20080710
> Pathname: annotationData/chipTypes/GenomeWideSNP_6/
> GenomeWideSNP_6,HB20080710.acs
> File size: 170.92 MB (179217531 bytes)
> RAM: 0.00 MB
> Number of data rows: 6892960
> File format: v1
> Dimensions: 6892960x26
> Column classes: raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw,
> raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw,
> raw
> Number of bytes per column: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
> Footer: <createdOn>20080710 22:47:02 PDT</
> createdOn><platform>Affymetrix</platform><chipType>GenomeWideSNP_6</
> chipType><srcFile><filename>GenomeWideSNP_6.probe_tab</
> filename><filesize>341479928</
> filesize><checksum>2037c033c09fd8f7c06bd042a77aef15</checksum></
> srcFile><srcFile2><filename>GenomeWideSNP_6.CN_probe_tab</
> filename><filesize>96968290</
> filesize><checksum>3dc2d3178f5eafdbea9c8b6eca88a89c</checksum></
> srcFile2>
> Chip type: GenomeWideSNP_6
> Platform: Affymetrix
>>
>> #Load CEL files in the folder "testing."  Note that inside the working 
>> directory we have the
>> #path rawData//testing//GenomeWideSNP_6, which contains the appropriate CEL 
>> files.
>> csR = AffymetrixCelSet$byName("smalltesting1", cdf = cdf)
>> print(csR)
> AffymetrixCelSet:
> Name: smalltesting1
> Tags:
> Path: rawData/smalltesting1/GenomeWideSNP_6
> Platform: Affymetrix
> Chip type: GenomeWideSNP_6,Full
> Number of arrays: 9
> Names: sample1, sample10, ..., sample9
> Time period: 2010-04-20 13:23:11 -- 2010-04-20 18:43:03
> Total file size: 592.90MB
> RAM: 0.01MB
>> #Create AffyBatch object for csR
>> ab = extractAffyBatch(csR)
> Loading required package: affy
> Loading required package: Biobase
>
> Welcome to Bioconductor
>
>  Vignettes contain introductory material. To view, type
>  'openVignette()'. To cite Bioconductor, see
>  'citation("Biobase")' and for packages 'citation(pkgname)'.
>
>
> Attaching package: 'Biobase'
>
> The following object(s) are masked from 'package:matrixStats':
>
>    anyMissing, rowMedians
>
>
> Attaching package: 'affy'
>
> The following object(s) are masked from 'package:aroma.light':
>
>    plotDensity
>
>
>        The following object(s) are masked _by_ package:Biobase :
>
>         .Depends
>
>
>        The following object(s) are masked _by_
> package:aroma.affymetrix :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:aroma.apd :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:R.huge :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:aroma.core :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:aroma.light :
>
>         .Depends plotDensity
>
>
>        The following object(s) are masked _by_ package:matrixStats :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:R.rsp :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:R.cache :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:R.filesets :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:R.utils :
>
>         .Depends
>
>
>        The following object(s) are masked _by_ package:R.oo :
>
>         .Depends
>
> Loading required package: genomewidesnp6,fullcdf
> Warning messages:
> 1: In fcn(...) : Packages reordered in search path: package:affy
> 2: In extractAffyBatch.AffymetrixCelSet(csR) :
>  CDF enviroment package 'genomewidesnp6,fullcdf' not installed. The
> 'affy' package will later try to download it from Bioconductor and
> install it.
>>
>> #########################
>> #########################
>> #####Now for normalization.  As per CMRAv2.0, we perform allelic
>> #####crosstalk calibration and normalization for nucleotide-position
>> #####probe sequence effects.
>> #########################
>> #########################
>> acc = AllelicCrosstalkCalibration(csR, model = "CRMAv2")
>> csC = process(acc, verbose = verbose)
> 20100903 13:35:10|Calibrating data set for allelic cross talk...
> 20100903 13:35:11| Already calibrated
> 20100903 13:35:11|Calibrating data set for allelic cross talk...done
>> bpn = BasePositionNormalization(csC, target = "zero")
>> csN = process(bpn, verbose = verbose)
> 20100903 13:35:11|Normalization data set for probe-sequence effects...
> 20100903 13:35:11| Already normalized
> 20100903 13:35:11|Normalization data set for probe-sequence
> effects...done
>>
>> #Now repeat the above QC steps based on the normalized data
>> abN = extractAffyBatch(csN)
> Loading required package: genomewidesnp6,fullcdf
> Error in read.affybatch(filenames = l$filenames, phenoData = l
> $phenoData,  :
>  It appears that the file probeData/smalltesting1,ACC,ra,-XY,BPN,-XY/
> GenomeWideSNP_6/sample1.CEL is corrupted.
> In addition: Warning message:
> In extractAffyBatch.AffymetrixCelSet(csN) :
>  CDF enviroment package 'genomewidesnp6,fullcdf' not installed. The
> 'affy' package will later try to download it from Bioconductor and
> install it.
>>
>> traceback()
> 5: .Call("read_abatch", filenames, rm.mask, rm.outliers, rm.extra,
>       ref.cdfName, dim.intensity, verbose, PACKAGE = "affyio")
> 4: read.affybatch(filenames = l$filenames, phenoData = l$phenoData,
>       description = l$description, notes = notes, compress =
> compress,
>       rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra =
> rm.extra,
>       verbose = verbose, sd = sd, cdfname = cdfname)
> 3: ReadAffy(filenames = filenames, sampleNames = sampleNames, ...,
>       verbose = as.logical(verbose))
> 2: extractAffyBatch.AffymetrixCelSet(csN)
> 1: extractAffyBatch(csN)
>> sessionInfo()
> R version 2.11.1 (2010-05-31)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods
> base
>
> other attached packages:
>  [1] Biobase_2.6.1          aroma.affymetrix_1.7.0
> aroma.apd_0.1.7
>  [4] affxparser_1.18.0      R.huge_0.2.0
> aroma.core_1.7.0
>  [7] aroma.light_1.16.1     matrixStats_0.2.1
> R.rsp_0.3.6
> [10] R.cache_0.3.0          R.filesets_0.8.3
> digest_0.4.2
> [13] R.utils_1.5.0          R.oo_1.7.3
> affy_1.24.2
> [16] R.methodsS3_1.2.0
>
> loaded via a namespace (and not attached):
> [1] affyio_1.14.0        preprocessCore_1.8.0 tools_2.11.1
>>
>
> --
> When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
> version of the package, 2) to report the output of sessionInfo() and 
> traceback(), and 3) to post a complete code example.
>
>
> You received this message because you are subscribed to the Google Groups 
> "aroma.affymetrix" group with website http://www.aroma-project.org/.
> To post to this group, send email to aroma-affymetrix@googlegroups.com
> To unsubscribe and other options, go to http://www.aroma-project.org/forum/
>

-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

Reply via email to