Hello,

I know this has been addressed before, but I am still not clear about it and not sure whether it was solved. I am having trouble getting gcRMA to work with a Human Transcriptome Array 2.0. I created custom CDFs based on probes mapped to the 5' UTR only. Please find my R code, the output/error, and sessionInfo() below. I would really appreciate your help with the following questions:

1. Regarding the probe_tab file required for GcRmaBackgroundCorrection: does it have to have specific column names? When I change the first column name from "Probe Set Name" to "Probe SetName", I get a completely different error (a bit more reminiscent of the error I read about in another post on this group from 2010 regarding GcRmaBackgroundCorrection):

   Error: Either argument 'names' or 'pattern' must be specified.
   In addition: Warning message:
   In readDataFrame.TabularTextFile(ptf, colClasses = c(`^(unitName|probeSetID)$` = "character"), :
     Argument 'rows' was out of range [1,0]. Ignored rows beyond this range.

2. As I understand from your previous posts, the solution would be to download the Affymetrix CDF for background-correction purposes. Could you explain why this is needed and why it can't be done using the custom CDF? Also, if I followed that approach, how would it affect the analysis when only a small subset of probes is used?

Thank you

Best wishes
Krzysztof

This is my code:

library(aroma.affymetrix)

# Read all the CEL files that are in the data set folder
cdf <- AffymetrixCdfFile$byChipType(chipType, tags='binary')
cs <- AffymetrixCelSet$byName(name, cdf=cdf, verbose=verbose)

# Subset to the CEL files listed in the experimental design
# (the folder may contain more CEL files than are currently being used)
mt <- match(design[,1], getNames(cs))
ds <- extract(cs, mt)
setCdf(ds, cdf)

# Background correction
bc <- GcRmaBackgroundCorrection(ds, type="affinities")
csBC <- process(bc, verbose=-20)

# Here is the output:
Background correcting data set...
Background correcting data set...
Already background corrected for "optical" effects
Background correcting data set...done
Computing probe affinities (independent of data)...
Computing GCRMA probe affinities...
Chip type: HTA2_hg19_refseq_fiveutr
Number of units: 33640
Locating the cell sequence annotation data file...
Locating the cell sequence annotation data file...done
Computing GCRMA probe affinities...done
Computing GCRMA probe affinities...
Number of units: 33640
Identify PMs and MMs among the CDF cell indices...
 logi [1:822842] TRUE TRUE TRUE TRUE TRUE TRUE ...
   Mode    TRUE    NA's
logical  822842       0
MMs are defined as non-PMs
Number of PMs: 822842
Number of MMs: 0
Identify PMs and MMs among the CDF cell indices...done
Reading probe-sequence data...
Retrieving probe-sequence data...
Chip type (full): HTA2_hg19_refseq_fiveutr,binary
Locating probe-tab file...
Chip type: HTA2_hg19_refseq_fiveutr
AffymetrixProbeTabFile:
Name: HTA2_hg19_refseq_fiveutr
Tags:
Full name: HTA2_hg19_refseq_fiveutr
Pathname: annotationData/chipTypes/HTA2_hg19_refseq_fiveutr/HTA2_hg19_refseq_fiveutr_probe_tab
File size: 43.03 MiB (45125121 bytes)
RAM: 0.01 MB
Number of data rows: NA
Columns [6]: 'unitName', 'probeXPos', 'probeYPos', 'interrogationPosition', 'probeSequence', 'targetStrandedness'
Number of text lines: NA
AffymetrixCdfFile:
Path: annotationData/chipTypes/HTA2_hg19_refseq_fiveutr
Filename: HTA2_hg19_refseq_fiveutr.cdf
File size: 16.44 MiB (17238612 bytes)
Chip type: HTA2_hg19_refseq_fiveutr
RAM: 0.00MB
File format: v4 (binary; XDA)
Dimension: 2572x2680
Number of cells: 6892960
Number of units: 33640
Cells per unit: 204.90
Number of QC units: 0
Locating probe-tab file...done
Validating probe-tab file against CDF...
Number of records read: 1
Data read:
'data.frame': 1 obs. of 1 variable:
 $ unitName: chr "NM_000015"
Unit name:
 chr "NM_000015"
Unit index: 2
  probeXPos probeYPos             probeSequence
1      2493       329 CTTCCCTTGCAGACTTTGGAAGGGA
(x,y):
[1] 2493 329
Validating probe-tab file against CDF...done
Reading (x,y,sequence) data...
Reading (x,y,sequence) data...done
Validating (x,y) against CDF dimension...
CDF dimension:
nbrOfRows nbrOfColumns
     2572         2680
[2016-03-31 10:22:42] Exception: Detected probe x position out of range [0,2572]: annotationData/chipTypes/HTA2_hg19_refseq_fiveutr/HTA2_hg19_refseq_fiveutr_probe_tab

  at #08.
getProbeSequenceData.AffymetrixCdfFile(this, safe = safe, verbose = verbose)
    - getProbeSequenceData.AffymetrixCdfFile() is in environment 'aroma.affymetrix'

  at #07. getProbeSequenceData(this, safe = safe, verbose = verbose)
    - getProbeSequenceData() is in environment 'aroma.affymetrix'

  at #06. computeAffinities.AffymetrixCdfFile(cdf, ..., verbose = less(verbose))
    - computeAffinities.AffymetrixCdfFile() is in environment 'aroma.affymetrix'

  at #05. computeAffinities(cdf, ..., verbose = less(verbose))
    - computeAffinities() is in environment 'aroma.affymetrix'

  at #04. calculateAffinities.GcRmaBackgroundCorrection(this, verbose = less(verbose))
    - calculateAffinities.GcRmaBackgroundCorrection() is in environment 'aroma.affymetrix'

  at #03. calculateAffinities(this, verbose = less(verbose))
    - calculateAffinities() is in environment 'aroma.affymetrix'

  at #02. process.GcRmaBackgroundCorrection(bc, verbose = -20)
    - process.GcRmaBackgroundCorrection() is in environment 'aroma.affymetrix'

  at #01. process(bc, verbose = -20)
    - process() is in environment 'aroma.core'

Error: Detected probe x position out of range [0,2572]: annotationData/chipTypes/HTA2_hg19_refseq_fiveutr/HTA2_hg19_refseq_fiveutr_probe_tab
Validating (x,y) against CDF dimension...done
Retrieving probe-sequence data...done
Reading probe-sequence data...done
Computing GCRMA probe affinities...done
Computing probe affinities (independent of data)...done
Background correcting data set...done

# Session info
R version 3.2.3 (2015-12-10)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.11.1 (El Capitan)

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] GO.db_3.2.2 org.Hs.eg.db_3.2.3 RSQLite_1.0.0 DBI_0.3.1 AnnotationDbi_1.32.3
[6] Biobase_2.30.0 Sushi_1.8.0 moments_0.14 entropy_1.2.1 zoo_1.7-12
[11] biomaRt_2.26.1 reshape2_1.4.1 rtracklayer_1.30.3
GenomicRanges_1.22.4 GenomeInfoDb_1.6.3
[16] IRanges_2.4.8 S4Vectors_0.8.11 BiocGenerics_0.16.1 FIRMAGene_0.9.8 dplyr_0.4.3
[21] stringr_1.0.0 data.table_1.9.6 aroma.light_3.0.0 aroma.affymetrix_3.0.0 aroma.core_3.0.0
[26] R.devices_2.14.0 R.filesets_2.10.0 R.utils_2.2.0 R.oo_1.20.0 affxparser_1.42.0
[31] R.methodsS3_1.7.1

loaded via a namespace (and not attached):
[1] Rcpp_0.12.3 lattice_0.20-33 listenv_0.6.0 Rsamtools_1.22.0
[5] Biostrings_2.38.4 assertthat_0.1 digest_0.6.9 R6_2.1.2
[9] plyr_1.8.3 chron_2.3-47 futile.options_1.0.0 R.huge_0.9.0
[13] BiocInstaller_1.20.1 zlibbioc_1.16.0 preprocessCore_1.32.0 splines_3.2.3
[17] BiocParallel_1.4.3 gcrma_2.42.0 RCurl_1.95-4.8 base64enc_0.1-3
[21] aroma.apd_0.6.0 R.rsp_0.21.0 globals_0.6.1 SummarizedExperiment_1.0.2
[25] DNAcopy_1.44.0 codetools_0.2-14 matrixStats_0.50.1 XML_3.98-1.4
[29] future_0.12.0 GenomicAlignments_1.6.3 bitops_1.0-6 grid_3.2.3
[33] affy_1.48.0 magrittr_1.5 stringi_1.0-1 XVector_0.10.0
[37] affyio_1.40.0 PSCBS_0.61.0 futile.logger_1.4.1 lambda.r_1.1.7
[41] tools_3.2.3 R.cache_0.12.0
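
The coordinate range check that fails in the log can be tried outside aroma.affymetrix. The following is only a minimal sketch in base R, not what aroma.affymetrix does internally. Only the first row comes from the log; the other two are made up. The column names are the ones the log reports, and the sketch assumes zero-based probe coordinates with x indexing columns and y indexing rows. It compares x against the number of columns (2680) and, for contrast, against the number of rows (2572), which is the bound quoted in the error message.

```r
# Toy probe-tab content: the first row is the record shown in the log,
# the other two rows are hypothetical and exist only to exercise the checks.
probe_tab_text <- paste(
  "unitName\tprobeXPos\tprobeYPos\tprobeSequence",
  "NM_000015\t2493\t329\tCTTCCCTTGCAGACTTTGGAAGGGA",
  "NM_000015\t2600\t100\tACGTACGTACGTACGTACGTACGTA",
  "NM_000015\t2680\t10\tTTTTTGGGGGCCCCCAAAAATTTTT",
  sep = "\n"
)
probes <- read.delim(text = probe_tab_text, stringsAsFactors = FALSE)

# CDF dimension as reported in the log
nbr_of_rows <- 2572L
nbr_of_columns <- 2680L

# Assumption: zero-based coordinates, x runs over columns and y over rows
x_ok <- probes$probeXPos >= 0 & probes$probeXPos < nbr_of_columns
y_ok <- probes$probeYPos >= 0 & probes$probeYPos < nbr_of_rows

# For contrast: x compared against the number of rows, i.e. the
# range [0,2572] quoted in the error message
x_ok_vs_rows <- probes$probeXPos >= 0 & probes$probeXPos <= nbr_of_rows

print(data.frame(probes[, c("probeXPos", "probeYPos")],
                 x_ok, y_ok, x_ok_vs_rows))
```

On a real probe-tab file, rows where x_ok is TRUE but x_ok_vs_rows is FALSE would be probes that lie on the array but beyond the bound quoted in the error, which might help tell a problem in the file from a problem in the check.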