Hi All,

I'm interested in doing some basic quality control with Affy SNP6.0
chips.  In particular, I'd like to get summary statistics and plot
densities of the intensity values for each array both before and after
normalization (allelic crosstalk calibration and base position
normalization).  Plotting the densities pre/post-normalization is no
problem.  Moreover, I can use extractAffyBatch() to get the intensity
values and summary statistics before normalizing.  However, I get an
error about a corrupted CEL file when I try to use extractAffyBatch()
on the normalized data set.

Is there a way to extract the intensity values without using
extractAffyBatch()?

My code, traceback, and sessionInfo are below.

Thanks in advance,


> #Load aroma
> library(aroma.affymetrix)
Loading required package: R.utils
Loading required package: R.oo
Loading required package: R.methodsS3
R.methodsS3 v1.2.0 (2010-03-13) successfully loaded. See ?R.methodsS3
for help.
R.oo v1.7.3 (2010-06-04) successfully loaded. See ?R.oo for help.
R.utils v1.5.0 (2010-08-04) successfully loaded. See ?R.utils for
help.
Loading required package: R.filesets
Loading required package: digest
R.filesets v0.8.3 (2010-07-06) successfully loaded. See ?R.filesets
for help.
Loading required package: aroma.core
Loading required package: R.cache
R.cache v0.3.0 (2010-03-13) successfully loaded. See ?R.cache for
help.
Loading required package: R.rsp
R.rsp v0.3.6 (2009-09-16) successfully loaded. See ?R.rsp for help.
 Type browseRsp() to open the RSP main menu in your browser.
Loading required package: matrixStats
matrixStats v0.2.1 (2010-04-05) successfully loaded. See ?matrixStats
for help.
Loading required package: aroma.light
aroma.light v1.16.1 (2010-06-23) successfully loaded. See ?aroma.light
for help.
aroma.core v1.7.0 (2010-07-26) successfully loaded. See ?aroma.core
for help.
Loading required package: aroma.apd
Loading required package: R.huge
R.huge v0.2.0 (2009-10-16) successfully loaded. See ?R.huge for help.
Loading required package: affxparser
aroma.apd v0.1.7 (2009-10-16) successfully loaded. See ?aroma.apd for
aroma.affymetrix v1.7.0 (2010-07-26) successfully loaded. See ?
aroma.affymetrix for help.
> log = verbose = Arguments$getVerbose(-8, timestamp = TRUE)
> options(digits = 4)
> #Test to make sure things are working
> cdf = AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags = "Full")
> print(cdf)
Path: annotationData/chipTypes/GenomeWideSNP_6
Filename: GenomeWideSNP_6,Full.cdf
Filesize: 470.44MB
Chip type: GenomeWideSNP_6,Full
RAM: 0.00MB
File format: v4 (binary; XDA)
Dimension: 2572x2680
Number of cells: 6892960
Number of units: 1881415
Cells per unit: 3.66
Number of QC units: 4
> gi = getGenomeInformation(cdf)
> print(gi)
Name: GenomeWideSNP_6
Tags: Full,na30,hg18,HB20100215
Full name: GenomeWideSNP_6,Full,na30,hg18,HB20100215
Pathname: annotationData/chipTypes/GenomeWideSNP_6/
File size: 8.97 MB (9407867 bytes)
RAM: 0.00 MB
Chip type: GenomeWideSNP_6,Full
> si = getSnpInformation(cdf)
> print(si)
Name: GenomeWideSNP_6
Tags: Full,na30,hg18,HB20100215
Full name: GenomeWideSNP_6,Full,na30,hg18,HB20100215
Pathname: annotationData/chipTypes/GenomeWideSNP_6/
File size: 7.18 MB (7526452 bytes)
RAM: 0.00 MB
Chip type: GenomeWideSNP_6,Full
Number of enzymes: 2
> acs = AromaCellSequenceFile$byChipType(getChipType(cdf, fullname = FALSE))
> print(acs)
Name: GenomeWideSNP_6
Tags: HB20080710
Full name: GenomeWideSNP_6,HB20080710
Pathname: annotationData/chipTypes/GenomeWideSNP_6/
File size: 170.92 MB (179217531 bytes)
RAM: 0.00 MB
Number of data rows: 6892960
File format: v1
Dimensions: 6892960x26
Column classes: raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw,
raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw,
Number of bytes per column: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
Footer: <createdOn>20080710 22:47:02 PDT</
Chip type: GenomeWideSNP_6
Platform: Affymetrix
> #Load CEL files in the folder "testing."  Note that inside the working directory we have the
> #path rawData//testing//GenomeWideSNP_6, which contains the appropriate CEL files.
> csR = AffymetrixCelSet$byName("smalltesting1", cdf = cdf)
> print(csR)
Name: smalltesting1
Path: rawData/smalltesting1/GenomeWideSNP_6
Platform: Affymetrix
Chip type: GenomeWideSNP_6,Full
Number of arrays: 9
Names: sample1, sample10, ..., sample9
Time period: 2010-04-20 13:23:11 -- 2010-04-20 18:43:03
Total file size: 592.90MB
RAM: 0.01MB
> #Create AffyBatch object for csR
> ab = extractAffyBatch(csR)
Loading required package: affy
Loading required package: Biobase

Welcome to Bioconductor

  Vignettes contain introductory material. To view, type
  'openVignette()'. To cite Bioconductor, see
  'citation("Biobase")' and for packages 'citation(pkgname)'.

Attaching package: 'Biobase'

The following object(s) are masked from 'package:matrixStats':

    anyMissing, rowMedians

Attaching package: 'affy'

The following object(s) are masked from 'package:aroma.light':


        The following object(s) are masked _by_ package:Biobase :


        The following object(s) are masked _by_
package:aroma.affymetrix :


        The following object(s) are masked _by_ package:aroma.apd :


        The following object(s) are masked _by_ package:R.huge :


        The following object(s) are masked _by_ package:aroma.core :


        The following object(s) are masked _by_ package:aroma.light :

         .Depends plotDensity

        The following object(s) are masked _by_ package:matrixStats :


        The following object(s) are masked _by_ package:R.rsp :


        The following object(s) are masked _by_ package:R.cache :


        The following object(s) are masked _by_ package:R.filesets :


        The following object(s) are masked _by_ package:R.utils :


        The following object(s) are masked _by_ package:R.oo :


Loading required package: genomewidesnp6,fullcdf
Warning messages:
1: In fcn(...) : Packages reordered in search path: package:affy
2: In extractAffyBatch.AffymetrixCelSet(csR) :
  CDF enviroment package 'genomewidesnp6,fullcdf' not installed. The
'affy' package will later try to download it from Bioconductor and
install it.
> #########################
> #########################
> #####Now for normalization.  As per CRMAv2, we perform allelic
> #####crosstalk calibration and normalization for nucleotide-position
> #####probe sequence effects.
> #########################
> #########################
> acc = AllelicCrosstalkCalibration(csR, model = "CRMAv2")
> csC = process(acc, verbose = verbose)
20100903 13:35:10|Calibrating data set for allelic cross talk...
20100903 13:35:11| Already calibrated
20100903 13:35:11|Calibrating data set for allelic cross talk...done
> bpn = BasePositionNormalization(csC, target = "zero")
> csN = process(bpn, verbose = verbose)
20100903 13:35:11|Normalization data set for probe-sequence effects...
20100903 13:35:11| Already normalized
20100903 13:35:11|Normalization data set for probe-sequence
> #Now repeat the above QC steps based on the normalized data
> abN = extractAffyBatch(csN)
Loading required package: genomewidesnp6,fullcdf
Error in read.affybatch(filenames = l$filenames, phenoData = l
$phenoData,  :
  It appears that the file probeData/smalltesting1,ACC,ra,-XY,BPN,-XY/
GenomeWideSNP_6/sample1.CEL is corrupted.
In addition: Warning message:
In extractAffyBatch.AffymetrixCelSet(csN) :
  CDF enviroment package 'genomewidesnp6,fullcdf' not installed. The
'affy' package will later try to download it from Bioconductor and
install it.
> traceback()
5: .Call("read_abatch", filenames, rm.mask, rm.outliers, rm.extra,
       ref.cdfName, dim.intensity, verbose, PACKAGE = "affyio")
4: read.affybatch(filenames = l$filenames, phenoData = l$phenoData,
       description = l$description, notes = notes, compress =
       rm.mask = rm.mask, rm.outliers = rm.outliers, rm.extra =
       verbose = verbose, sd = sd, cdfname = cdfname)
3: ReadAffy(filenames = filenames, sampleNames = sampleNames, ...,
       verbose = as.logical(verbose))
2: extractAffyBatch.AffymetrixCelSet(csN)
1: extractAffyBatch(csN)
> sessionInfo()
R version 2.11.1 (2010-05-31)

[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods

other attached packages:
 [1] Biobase_2.6.1          aroma.affymetrix_1.7.0
 [4] affxparser_1.18.0      R.huge_0.2.0
 [7] aroma.light_1.16.1     matrixStats_0.2.1
[10] R.cache_0.3.0          R.filesets_0.8.3
[13] R.utils_1.5.0          R.oo_1.7.3
[16] R.methodsS3_1.2.0

loaded via a namespace (and not attached):
[1] affyio_1.14.0        preprocessCore_1.8.0 tools_2.11.1
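
P.S. To make the goal concrete, here is a minimal sketch of the per-array summary statistics I have in mind. It runs on a simulated matrix: the object names `intensities`, `log_int`, and `array_stats` and the simulated values are made up for illustration and are not read from my CEL files. It only shows what I would compute once the normalized intensities are available as a cells-by-arrays matrix; it does not address how to get them out of csN.

```r
# Hypothetical stand-in for probe intensities: rows are cells (probes),
# columns are arrays. Real values would come from the normalized CEL files.
set.seed(1)
intensities <- matrix(rlnorm(9 * 1000, meanlog = 8, sdlog = 1), ncol = 9,
                      dimnames = list(NULL, paste("sample", 1:9, sep = "")))

# Work on the log2 scale, as is usual for intensity QC.
log_int <- log2(intensities)

# One row of summary statistics per array.
array_stats <- t(apply(log_int, 2, function(x) {
  c(min    = min(x),
    q1     = unname(quantile(x, 0.25)),
    median = median(x),
    mean   = mean(x),
    q3     = unname(quantile(x, 0.75)),
    max    = max(x))
}))
print(round(array_stats, 2))
```

Running the same summary on the raw and the normalized intensities would give the before/after comparison I am after.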

