Hi Henrik,

Thanks for your response. Let me clarify what I'm trying to do: For each array I want to be able to compute basic summary statistics (min, median, max, etc.) of the probe-level intensities, both before and after normalization.

My goal here is to do some basic quality control. The CRMAv2 vignette shows how to produce density plots for the probe-level intensities both before and after normalization, and clearly this is useful. Using extractAffyBatch(), I can extract the probe-level intensities *before* normalization, and thus produce spatial plots of the intensities and compute the summary statistics. However, I'd like to be able to get the same information after normalization. Given that you're much more well-versed in these matters than I am, I'd be happy to hear any suggestions you have regarding quality control.

Thanks again,

Vonn

On Fri, Sep 3, 2010 at 10:47 AM, Vonn <vwal...@email.unc.edu> wrote:
Hi All,

I'm interested in doing some basic quality control with Affy SNP6.0
chips.  In particular, I'd like to get summary statistics and plot
densities of the intensity values for each array both before and after
normalization (allelic crosstalk calibration and base position
normalization).  Plotting the densities pre/post-normalization is no
problem.  Moreover, I can use extractAffyBatch() to get the intensity
values and summary statistics before normalizing.  However, I get an
error about a corrupted CEL file when I try to use extractAffyBatch()
post-normalization.

Is there a way to extract the intensity values without using
extractAffyBatch()?

Actually, it does not make sense to try to extract an AffyBatch from a
ChipEffectSet.  The reason is that an AffyBatch should hold
probe-level signals, while a ChipEffectSet contains summarized
signals.  In the next release of aroma.affymetrix, extractAffyBatch()
will throw a more informative error message explaining this.

I'd like to ask you what do you want to do in the next step, because
depending on what you want to do, there are different ways of
extract/working with/exporting the summarized signals.

For instance, you may want to use the extractDataFrame() method, cf.

 http://www.aroma-project.org/howtos/extractDataFrame

Notes: This will load all requested data into memory, so you probably
want to subset by units and/or samples when using that method.  Also,
the first time you use this method for a new chip type, it will take
some time, because it first has to do some formatting of the
annotation data.  Just be patient; you can use verbose=-50 to see
what's going on.

See also extractTheta(), cf.

 http://www.aroma-project.org/howtos/extractTheta

If you want to export your data to tab-delimited files, there is also
writeDataFrame(), cf.

 http:/www.aroma-project.org/howtos/writeDataFrame

There are more options, but I leave it there until I know what you are
really after.

Hope this helps

Henrik


My code, traceback, and sessionInfo are below.

Thanks in advance,

Vonn

#Load aroma
library(aroma.affymetrix)
Loading required package: R.utils
Loading required package: R.oo
Loading required package: R.methodsS3
R.methodsS3 v1.2.0 (2010-03-13) successfully loaded. See ?R.methodsS3
for help.
R.oo v1.7.3 (2010-06-04) successfully loaded. See ?R.oo for help.
R.utils v1.5.0 (2010-08-04) successfully loaded. See ?R.utils for
help.
Loading required package: R.filesets
Loading required package: digest
R.filesets v0.8.3 (2010-07-06) successfully loaded. See ?R.filesets
for help.
Loading required package: aroma.core
Loading required package: R.cache
R.cache v0.3.0 (2010-03-13) successfully loaded. See ?R.cache for
help.
Loading required package: R.rsp
R.rsp v0.3.6 (2009-09-16) successfully loaded. See ?R.rsp for help.
 Type browseRsp() to open the RSP main menu in your browser.
Loading required package: matrixStats
matrixStats v0.2.1 (2010-04-05) successfully loaded. See ?matrixStats
for help.
Loading required package: aroma.light
aroma.light v1.16.1 (2010-06-23) successfully loaded. See ?aroma.light
for help.
aroma.core v1.7.0 (2010-07-26) successfully loaded. See ?aroma.core
for help.
Loading required package: aroma.apd
Loading required package: R.huge
R.huge v0.2.0 (2009-10-16) successfully loaded. See ?R.huge for help.
Loading required package: affxparser
aroma.apd v0.1.7 (2009-10-16) successfully loaded. See ?aroma.apd for
help.
aroma.affymetrix v1.7.0 (2010-07-26) successfully loaded. See ?
aroma.affymetrix for help.
log = verbose = Arguments$getVerbose(-8, timestamp = TRUE)
options(digits = 4)

#Test to make sure things are working
cdf = AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags = "Full")
print(cdf)
AffymetrixCdfFile:
Path: annotationData/chipTypes/GenomeWideSNP_6
Filename: GenomeWideSNP_6,Full.cdf
Filesize: 470.44MB
Chip type: GenomeWideSNP_6,Full
RAM: 0.00MB
File format: v4 (binary; XDA)
Dimension: 2572x2680
Number of cells: 6892960
Number of units: 1881415
Cells per unit: 3.66
Number of QC units: 4

gi = getGenomeInformation(cdf)
print(gi)
UgpGenomeInformation:
Name: GenomeWideSNP_6
Tags: Full,na30,hg18,HB20100215
Full name: GenomeWideSNP_6,Full,na30,hg18,HB20100215
Pathname: annotationData/chipTypes/GenomeWideSNP_6/
GenomeWideSNP_6,Full,na30,hg18,HB20100215.ugp
File size: 8.97 MB (9407867 bytes)
RAM: 0.00 MB
Chip type: GenomeWideSNP_6,Full

si = getSnpInformation(cdf)
print(si)
UflSnpInformation:
Name: GenomeWideSNP_6
Tags: Full,na30,hg18,HB20100215
Full name: GenomeWideSNP_6,Full,na30,hg18,HB20100215
Pathname: annotationData/chipTypes/GenomeWideSNP_6/
GenomeWideSNP_6,Full,na30,hg18,HB20100215.ufl
File size: 7.18 MB (7526452 bytes)
RAM: 0.00 MB
Chip type: GenomeWideSNP_6,Full
Number of enzymes: 2

acs = AromaCellSequenceFile$byChipType(getChipType(cdf, fullname = FALSE))
print(acs)
AromaCellSequenceFile:
Name: GenomeWideSNP_6
Tags: HB20080710
Full name: GenomeWideSNP_6,HB20080710
Pathname: annotationData/chipTypes/GenomeWideSNP_6/
GenomeWideSNP_6,HB20080710.acs
File size: 170.92 MB (179217531 bytes)
RAM: 0.00 MB
Number of data rows: 6892960
File format: v1
Dimensions: 6892960x26
Column classes: raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw,
raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw,
raw
Number of bytes per column: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
Footer: <createdOn>20080710 22:47:02 PDT</
createdOn><platform>Affymetrix</platform><chipType>GenomeWideSNP_6</
chipType><srcFile><filename>GenomeWideSNP_6.probe_tab</
filename><filesize>341479928</
filesize><checksum>2037c033c09fd8f7c06bd042a77aef15</checksum></
srcFile><srcFile2><filename>GenomeWideSNP_6.CN_probe_tab</
filename><filesize>96968290</
filesize><checksum>3dc2d3178f5eafdbea9c8b6eca88a89c</checksum></
srcFile2>
Chip type: GenomeWideSNP_6
Platform: Affymetrix

#Load CEL files in the folder "testing."  Note that inside the working directory we have the #path rawData//testing//GenomeWideSNP_6, which contains the appropriate CEL files.
csR = AffymetrixCelSet$byName("smalltesting1", cdf = cdf)
print(csR)
AffymetrixCelSet:
Name: smalltesting1
Tags:
Path: rawData/smalltesting1/GenomeWideSNP_6
Platform: Affymetrix
Chip type: GenomeWideSNP_6,Full
Number of arrays: 9
Names: sample1, sample10, ..., sample9
Time period: 2010-04-20 13:23:11 -- 2010-04-20 18:43:03
Total file size: 592.90MB
RAM: 0.01MB
#Create AffyBatch object for csR
ab = extractAffyBatch(csR)
Loading required package: affy
Loading required package: Biobase

Welcome to Bioconductor

 Vignettes contain introductory material. To view, type
 'openVignette()'. To cite Bioconductor, see
 'citation("Biobase")' and for packages 'citation(pkgname)'.


Attaching package: 'Biobase'

The following object(s) are masked from 'package:matrixStats':

   anyMissing, rowMedians


Attaching package: 'affy'

The following object(s) are masked from 'package:aroma.light':

   plotDensity


       The following object(s) are masked _by_ package:Biobase :

        .Depends


       The following object(s) are masked _by_
package:aroma.affymetrix :

        .Depends


       The following object(s) are masked _by_ package:aroma.apd :

        .Depends


       The following object(s) are masked _by_ package:R.huge :

        .Depends


       The following object(s) are masked _by_ package:aroma.core :

        .Depends


       The following object(s) are masked _by_ package:aroma.light :

        .Depends plotDensity


       The following object(s) are masked _by_ package:matrixStats :

        .Depends


       The following object(s) are masked _by_ package:R.rsp :

        .Depends


       The following object(s) are masked _by_ package:R.cache :

        .Depends


       The following object(s) are masked _by_ package:R.filesets :

        .Depends


       The following object(s) are masked _by_ package:R.utils :

        .Depends


       The following object(s) are masked _by_ package:R.oo :

        .Depends

Loading required package: genomewidesnp6,fullcdf
Warning messages:
1: In fcn(...) : Packages reordered in search path: package:affy
2: In extractAffyBatch.AffymetrixCelSet(csR) :
 CDF enviroment package 'genomewidesnp6,fullcdf' not installed. The
'affy' package will later try to download it from Bioconductor and
install it.

#########################
#########################
#####Now for normalization.  As per CMRAv2.0, we perform allelic
#####crosstalk calibration and normalization for nucleotide-position
#####probe sequence effects.
#########################
#########################
acc = AllelicCrosstalkCalibration(csR, model = "CRMAv2")
csC = process(acc, verbose = verbose)
20100903 13:35:10|Calibrating data set for allelic cross talk...
20100903 13:35:11| Already calibrated
20100903 13:35:11|Calibrating data set for allelic cross talk...done
bpn = BasePositionNormalization(csC, target = "zero")
csN = process(bpn, verbose = verbose)
20100903 13:35:11|Normalization data set for probe-sequence effects...
20100903 13:35:11| Already normalized
20100903 13:35:11|Normalization data set for probe-sequence
effects...done

#Now repeat the above QC steps based on the normalized data
abN = extractAffyBatch(csN)
Loading required package: genomewidesnp6,fullcdf
Error in read.affybatch(filenames = l$filenames, phenoData = l
$phenoData,  :
 It appears that the file probeData/smalltesting1,ACC,ra,-XY,BPN,-XY/
GenomeWideSNP_6/sample1.CEL is corrupted.
In addition: Warning message:
In extractAffyBatch.AffymetrixCelSet(csN) :
 CDF enviroment package 'genomewidesnp6,fullcdf' not installed. The
'affy' package will later try to download it from Bioconductor and
install it.

traceback()
5: .Call("read_abatch", filenames, rm.mask, rm.outliers, rm.extra,
      ref.cdfName, dim.intensity, verbose, PACKAGE = "affyio")
4: read.affybatch(filenames = l$filenames, phenoData = l$phenoData,
      description = l$description, notes = notes, compress =
compress,
      rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra =
rm.extra,
      verbose = verbose, sd = sd, cdfname = cdfname)
3: ReadAffy(filenames = filenames, sampleNames = sampleNames, ...,
      verbose = as.logical(verbose))
2: extractAffyBatch.AffymetrixCelSet(csN)
1: extractAffyBatch(csN)
sessionInfo()
R version 2.11.1 (2010-05-31)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods
base

other attached packages:
 [1] Biobase_2.6.1          aroma.affymetrix_1.7.0
aroma.apd_0.1.7
 [4] affxparser_1.18.0      R.huge_0.2.0
aroma.core_1.7.0
 [7] aroma.light_1.16.1     matrixStats_0.2.1
R.rsp_0.3.6
[10] R.cache_0.3.0          R.filesets_0.8.3
digest_0.4.2
[13] R.utils_1.5.0          R.oo_1.7.3
affy_1.24.2
[16] R.methodsS3_1.2.0

loaded via a namespace (and not attached):
[1] affyio_1.14.0        preprocessCore_1.8.0 tools_2.11.1


--
When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/


--
When reporting problems on aroma.affymetrix, make sure 1) to run the latest version of the package, 2) to report the output of sessionInfo() and traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups "aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/




--
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
"aroma.affymetrix" group with website http://www.aroma-project.org/.
To post to this group, send email to aroma-affymetrix@googlegroups.com
To unsubscribe and other options, go to http://www.aroma-project.org/forum/

Reply via email to