Re: Reference for allele crosstalk calibration method? (Was: Re: [aroma.affymetrix] Re: Error in ACC using custom cdf)

Henrik Bengtsson Tue, 10 Feb 2009 19:51:17 -0800

Hi.

On Tue, Feb 3, 2009 at 10:39 PM, David Rosenberg
<david.m.rosenb...@gmail.com> wrote:
>
> Thank you greatly for your help.  A couple of additional questions:


[snip]

> 2.)     I don't fully follow the acc model described in the CRMA paper
> (2008).  The section in question references a 2001 Wirapati P, Speed
> T. paper that is listed as 'draft.'  Can you recommend another
> reference / example of how the acc transformation is performed and the
> crosstalk matrix/offset vector are calculated?

There is also:

Wirapati P, Speed TP (2002) An algorithm to fit a simplex to a set of
multidimensional points. WEHI Bioinformatics technical notes.
http://www.isrec.isb-sib.ch/~pwirapat/sfit/wirapati2002wehi-bioinf.pdf

It has been on my and Asa's (P. Wirapati) to-do list for a while to write
up and submit a manuscript on this and more.  The last time we had a
serious chat and made a start on it was in May 2008, but higher-priority
requests keep postponing it.  I think that is very unfortunate, because
Asa's work on this over the years (he should get most of the credit) is
quite amazing; he is careful about theory, estimators, algorithm,
implementation, speed and everything you can imagine.  The project has
gained somewhat higher priority thanks to related work on resequencing
arrays.  We are also testing more generic weighted estimators that take
prior estimates.  ...one day.
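
Until such a write-up exists, a rough illustration of the calibration step
itself may help.  It assumes the usual affine crosstalk model, y = a + S x,
where y = (yA, yB) is the observed allele pair, a the offset vector, S the
2x2 crosstalk matrix and x the crosstalk-free signal, so that calibration is
x = S^{-1} (y - a).  The values of `a` and `S` below are made up; in
practice they are estimated from the data (e.g. by the simplex fit in the
technical note above), which this sketch does not attempt.

```r
# Hypothetical offset vector and crosstalk matrix (made-up values).
a <- c(A = 200, B = 150)
S <- matrix(c(1.00, 0.15,
              0.20, 1.00), nrow = 2, byrow = TRUE,
            dimnames = list(c("A", "B"), c("A", "B")))

# Simulated crosstalk-free signals for three genotypes: AA, AB and BB.
x <- cbind(AA = c(2000, 0), AB = c(1000, 1000), BB = c(0, 2000))
rownames(x) <- c("A", "B")

# Forward model: what would be observed (noise-free in this toy example).
# 'a' is recycled down the columns, so a[1] is added to row A, a[2] to row B.
y <- a + S %*% x

# Calibration: subtract the offset, then undo the crosstalk.
calibrate <- function(y, a, S) solve(S, y - a)

xHat <- calibrate(y, a, S)
print(round(xHat))
```

With noise-free data and the true parameters, `xHat` recovers `x`; with
real data the quality of the result depends on how well `a` and `S` are
estimated.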

/Henrik

>
> Thanks again.
>
> On Feb 3, 2009, at 3:23 PM, Henrik Bengtsson wrote:
>
>>
>> Ok, sorry I didn't answer your question explicitly.  No, they do not
>> have to contain ChrX and ChrY data.  However, independent of your
>> question, I would recommend that you import all annotation data
>> available, since it might become of interest in future usage.
>>
>> Currently part of the code is hardwired to assume the human genome.
>> Thus, if you pass "-XY", it interprets "X" to correspond to chromosome
>> 23 and "Y" to be chromosome 24.  So, '-XY' tries to exclude units on
>> chromosome 23 and 24.  The plan is to support genome specific
>> annotation data where you can specify what chromosome index "X" and
>> "Y" maps to.  The directory annotationData/genomes/ is reserved for
>> this, and the ChromosomeExplorer is somewhat sensitive to this.  But
>> that is all future plans.
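
The '-XY' interpretation described above can be sketched in a few lines.
This is only an illustration, not the package's code: the function
`excludeUnits()`, the chromosome vector and the choice to drop units with
an unknown chromosome are all assumptions made for the example.

```r
# Hypothetical chromosome index per unit, as it might be stored in a UGP file.
# NA marks a unit without a known genomic position.
unitChromosome <- c(1L, 7L, 23L, 19L, 24L, NA, 23L, 2L)

# Hardwired human mapping described above: "X" -> 23, "Y" -> 24.
chrIndex <- c(X = 23L, Y = 24L)

# Interpret a subset specification such as "-XY": exclude units on X and Y.
excludeUnits <- function(spec, unitChromosome, chrIndex) {
  stopifnot(startsWith(spec, "-"))
  labels <- strsplit(substring(spec, 2), "")[[1]]
  exclude <- chrIndex[labels]
  # Keep units with a known chromosome that is not in the excluded set
  # (dropping NA units is an assumption of this sketch).
  which(!is.na(unitChromosome) & !(unitChromosome %in% exclude))
}

units <- excludeUnits("-XY", unitChromosome, chrIndex)
print(units)
```

On a genome where X and Y are not numbered 23 and 24, the same
specification would exclude the wrong units or none at all, which is why a
genome-specific mapping is needed.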
>>
>> /Henrik
>>
>> On Tue, Feb 3, 2009 at 1:03 PM, David Rosenberg
>> <david.m.rosenb...@gmail.com> wrote:
>>>
>>> Do the ufl/ugp/acc files need to be built with X and Y chromosomes
>>> included?  If so, this won't be too hard to fix.
>>>
>>>
>>> On Jan 30, 9:50 pm, David Rosenberg <david.m.rosenb...@gmail.com>
>>> wrote:
>>>> As I think about this further, it occurs to me that there are other
>>>> potential problems with this chip/cdf.  I was looking at the ugp/ufl
>>>> files and it appears that fragment length normalization etc. is
>>>> performed on a unit-by-unit basis.  The way the array/cdf is currently
>>>> structured, not all probes in a particular unit are precisely
>>>> co-located.  While the location differences within a unit are quite
>>>> small (100 bp or so), this does result in units where probes hybridize
>>>> to multiple digestion fragments.  This definitely 'breaks' fragment
>>>> length normalization as I see it currently implemented.  Now, the cdf
>>>> can be restructured such that all units map to a single genomic
>>>> location, but that seems to preclude merging/summarizing further down
>>>> the analysis workflow.  If there were a way to perform these
>>>> normalization procedures on a per-probe basis rather than a per-unit
>>>> basis, this would be preferable.  Let me know your thoughts.
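
One way to gauge how many units are affected is to count the distinct
digestion fragments hit by the probes of each unit.  The sketch below uses
an entirely made-up per-probe table (`probes`, with hypothetical unit
names, positions and fragment ids); the real annotation files are
organized per unit, so such a table would have to be built separately.

```r
# Hypothetical per-probe annotation: unit name, probe position (bp) and the
# id of the digestion fragment the probe falls on.  All values are made up.
probes <- data.frame(
  unit     = c("u1", "u1", "u1", "u2", "u2", "u3", "u3", "u3"),
  position = c(1000, 1040, 1100, 5000, 5030, 9000, 9060, 9110),
  fragment = c("f1", "f1", "f2", "f7", "f7", "f9", "f9", "f9"),
  stringsAsFactors = FALSE
)

# Number of distinct fragments hit by the probes of each unit.
nFragments <- tapply(probes$fragment, probes$unit,
                     function(f) length(unique(f)))

# Units for which a single per-unit fragment length is ambiguous.
ambiguous <- names(nFragments)[nFragments > 1]
print(ambiguous)
```

Units listed in `ambiguous` are the ones for which a per-unit fragment
length cannot describe every probe.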
>>>>
>>>> On Jan 30, 2009, at 6:41 PM, Henrik Bengtsson wrote:
>>>>
>>>>
>>>>
>>>>> Hi,
>>>>
>>>>> could you please forward your UGP file to me; I think I know what the
>>>>> problem is, but I guess it is easier for me to check it myself first.
>>>>
>>>>> BTW, although this looks like a custom CDF - if you want to, I can put
>>>>> up a group page specific to this chip type, documenting the chip type
>>>>> (and either link or host those annotation files). Might be useful for
>>>>> a future fellow researcher.  It's your call.
>>>>
>>>>> /Henrik
>>>>
>>>>> On Fri, Jan 30, 2009 at 12:08 PM, David Rosenberg
>>>>> <david.m.rosenb...@gmail.com> wrote:
>>>>
>>>>>> I am receiving the following errors when attempting to perform allelic
>>>>>> crosstalk calibration on a dataset using a custom CDF.  I don't know
>>>>>> if this is indicative of errors in the internal structure of the CDF,
>>>>>> or if there are parameters that I must pass to
>>>>>> AllelicCrosstalkCalibration due to the properties of the CDF (i.e. #
>>>>>> of chromosomes, etc.)
>>>>
>>>>>>> library("aroma.affymetrix")
>>>>>> Loading required package: R.utils
>>>>>> Loading required package: R.oo
>>>>>> Loading required package: R.methodsS3
>>>>>> R.methodsS3 v1.0.3 (2008-07-02) successfully loaded. See ?R.methodsS3 for help.
>>>>>> R.oo v1.4.6 (2008-08-11) successfully loaded. See ?R.oo for help.
>>>>>> R.utils v1.1.3 (2009-01-12) successfully loaded. See ?R.utils for help.
>>>>>> Loading required package: aroma.core
>>>>>> Loading required package: R.cache
>>>>>> R.cache v0.1.7 (2008-02-27) successfully loaded. See ?R.cache for help.
>>>>>> Loading required package: R.rsp
>>>>>> R.rsp v0.3.4 (2008-03-06) successfully loaded. See ?R.rsp for help.
>>>>>> Type browseRsp() to open the RSP main menu in your browser.
>>>>>> Loading required package: matrixStats
>>>>>> Loading required package: digest
>>>>>> Loading required package: aroma.light
>>>>>> aroma.light v1.11.1 (2009-01-12) successfully loaded. See ?aroma.light for help.
>>>>>> aroma.core v1.0.0 (2009-01-12) successfully loaded. See ?aroma.core for help.
>>>>>> Loading required package: affxparser
>>>>>> Loading required package: R.huge
>>>>>> R.huge v0.1.6 (2008-07-03) successfully loaded. See ?R.huge for help.
>>>>>> Loading required package: aroma.apd
>>>>>> aroma.apd v0.1.3 (2006-06-14) successfully loaded. See ?aroma.apd for help.
>>>>>> aroma.affymetrix v1.0.0 (2009-01-12) successfully loaded. See ?aroma.affymetrix for help.
>>>>
>>>>>>> log <- verbose <- Arguments$getVerbose(-8, timestamp=TRUE)
>>>>
>>>>>>> # Don't display too many decimals.
>>>>>>> options(digits=4)
>>>>>>> chipType="MOUSEDIVm520650"
>>>>>>> cdf<-AffymetrixCdfFile$byChipType("MOUSEDIVm520650")
>>>>>>> print(cdf)
>>>>>> AffymetrixCdfFile:
>>>>>> Path: annotationData/chipTypes/MOUSEDIVm520650
>>>>>> Filename: MOUSEDIVm520650.CDF
>>>>>> Filesize: 463.91MB
>>>>>> Chip type: MOUSEDIVm520650
>>>>>> RAM: 0.00MB
>>>>>> File format: v4 (binary; XDA)
>>>>>> Dimension: 2572x2680
>>>>>> Number of cells: 6892960
>>>>>> Number of units: 973990
>>>>>> Cells per unit: 7.08
>>>>>> Number of QC units: 4
>>>>>>> gi<-getGenomeInformation(cdf)
>>>>>>> print(gi)
>>>>>> UgpGenomeInformation:
>>>>>> Name: MOUSEDIVm520650
>>>>>> Tags: DMR20090129
>>>>>> Pathname: annotationData/chipTypes/MOUSEDIVm520650/MOUSEDIVm520650,DMR20090129.ugp
>>>>>> File size: 4.64MB
>>>>>> RAM: 0.00MB
>>>>>> Chip type: MOUSEDIVm520650
>>>>>>> si<-getSnpInformation(cdf)
>>>>>>> print(si)
>>>>>> UflSnpInformation:
>>>>>> Name: MOUSEDIVm520650
>>>>>> Tags: DMR20090129
>>>>>> Pathname: annotationData/chipTypes/MOUSEDIVm520650/MOUSEDIVm520650,DMR20090129.ufl
>>>>>> File size: 3.72MB
>>>>>> RAM: 0.00MB
>>>>>> Chip type: MOUSEDIVm520650
>>>>>> Number of enzymes: 2
>>>>>>> acs<-AromaCellSequenceFile$byChipType(getChipType(cdf, fullname=FALSE))
>>>>>>> print(acs)
>>>>>> AromaCellSequenceFile:
>>>>>> Name: MOUSEDIVm520650
>>>>>> Tags: DMR20090129
>>>>>> Pathname: annotationData/chipTypes/MOUSEDIVm520650/MOUSEDIVm520650,DMR20090129.acs
>>>>>> File size: 170.91MB
>>>>>> RAM: 0.00MB
>>>>>> Number of data rows: 6892960
>>>>>> File format: v1
>>>>>> Dimensions: 6892960x26
>>>>>> Column classes: raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw,
>>>>>> raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw, raw,
>>>>>> raw
>>>>>> Number of bytes per column: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
>>>>>> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
>>>>>> Footer: <createdOn>20090129 12:45:11 CST</createdOn><platform>Affymetrix</platform><chipType>MOUSEDIVm520650</chipType>
>>>>>> Chip type: MOUSEDIVm520650
>>>>>> Platform: Affymetrix
>>>>>> Chip type: MOUSEDIVm520650
>>>>>> Platform: Affymetrix
>>>>>>> csR<-AffymetrixCelSet$byName("mDIV/testset", cdf=cdf)
>>>>>>> print(csR)
>>>>>> AffymetrixCelSet:
>>>>>> Name: testset
>>>>>> Tags:
>>>>>> Path: rawData/mDIV/testset/MOUSEDIVm520650
>>>>>> Platform: Affymetrix
>>>>>> Chip type: MOUSEDIVm520650
>>>>>> Number of arrays: 44
>>>>>> Names: SNP_mDIV_A10-10_081308, SNP_mDIV_A10-201_091708, ..., SNP_mDIV_A9-9_081308
>>>>>> Time period: 2008-08-13 15:39:47 -- 2008-09-18 00:20:33
>>>>>> Total file size: 2899.65MB
>>>>>> RAM: 0.06MB
>>>>>> There were 50 or more warnings (use warnings() to see the first 50)
>>>>>>> acc<-AllelicCrosstalkCalibration(csR, model="CRMAv2")
>>>>>>> print(acc)
>>>>>> AllelicCrosstalkCalibration:
>>>>>> Data set: testset
>>>>>> Input tags:
>>>>>> User tags: *
>>>>>> Asterisk ('*') tags: ACC,ra,-XY
>>>>>> Output tags: ACC,ra,-XY
>>>>>> Number of files: 44 (2899.65MB)
>>>>>> Platform: Affymetrix
>>>>>> Chip type: MOUSEDIVm520650
>>>>>> Algorithm parameters: (rescaleBy: chr "all", targetAvg: num 2200,
>>>>>> subsetToAvg: chr "-XY", mergeShifts: logi TRUE, B: int 1, flavor: chr
>>>>>> "sfit", algorithmParameters:List of 3, ..$ alpha: num [1:8] 0.1 0.075
>>>>>> 0.05 0.03 0.01 0.0025 0.001 0.0001, ..$ q: num 2, ..$ Q: num 98)
>>>>>> Output path: probeData/testset,ACC,ra,-XY/MOUSEDIVm520650
>>>>>> Is done: FALSE
>>>>>> RAM: 0.01MB
>>>>>>> csC<-process(acc, verbose=verbose)
>>>>>> 20090130 14:03:43|Calibrating data set for allelic cross talk...
>>>>>> Error in if (any(units < 1)) stop("Argument 'units' contains non-positive indices.") :
>>>>>>   missing value where TRUE/FALSE needed
>>>>>> 20090130 14:03:43|Calibrating data set for allelic cross talk...done
>>>>>>> traceback()
>>>>>> 13: readCdfCellIndices(pathname, ...)
>>>>>> 12: getCellIndicesChunk(getPathname(this), units = unitsChunk, ...,
>>>>>>      verbose = verbose)
>>>>>> 11: fcn(idxs[ii], ...)
>>>>>> 10: lapplyInChunks.numeric(units, function(unitsChunk) {
>>>>>>      cdfChunk <- getCellIndicesChunk(getPathname(this), units = unitsChunk,
>>>>>>          ..., verbose = verbose)
>>>>>>      res <- vector("list", length(unitsChunk))
>>>>>>      res[[1]] <- unlist(cdfChunk, use.names = useNames)
>>>>>>      res
>>>>>>  }, chunkSize = 1e+05, useNames = useNames, verbose = verbose)
>>>>>> 9: lapplyInChunks(units, function(unitsChunk) {
>>>>>>     cdfChunk <- getCellIndicesChunk(getPathname(this), units = unitsChunk,
>>>>>>         ..., verbose = verbose)
>>>>>>     res <- vector("list", length(unitsChunk))
>>>>>>     res[[1]] <- unlist(cdfChunk, use.names = useNames)
>>>>>>     res
>>>>>> }, chunkSize = 1e+05, useNames = useNames, verbose = verbose)
>>>>>> 8: getCellIndices.AffymetrixCdfFile(cdf, units = subset, useNames = FALSE,
>>>>>>     unlist = TRUE)
>>>>>> 7: getCellIndices(cdf, units = subset, useNames = FALSE, unlist = TRUE)
>>>>>> 6: getSubsetToAvg.AllelicCrosstalkCalibration(this)
>>>>>> 5: getSubsetToAvg(this)
>>>>>> 4: getParameters.AllelicCrosstalkCalibration(this)
>>>>>> 3: getParameters(this)
>>>>>> 2: process.AllelicCrosstalkCalibration(acc, verbose = verbose)
>>>>>> 1: process(acc, verbose = verbose)
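
The error message itself is generic R behaviour and can be reproduced
without any array data: when the vector of unit indices contains a missing
value and no element is known to be below 1, `any(units < 1)` evaluates to
NA, and `if (NA)` is an error.  The small example below uses a made-up
`units` vector; it shows the mechanism only and says nothing about why
missing values arose in this particular data set.

```r
# A made-up vector of unit indices with one missing value.
units <- c(12L, NA, 57L)

# any() cannot decide: no element is known to be < 1, but one is unknown.
res <- any(units < 1)
print(res)

# if (NA) is an error in R; capture the message instead of stopping.
msg <- tryCatch({
  if (any(units < 1)) stop("Argument 'units' contains non-positive indices.")
  "ok"
}, error = function(e) conditionMessage(e))
print(msg)

# A simple check for missing indices before they reach such a test.
hasMissing <- anyNA(units)
print(hasMissing)
```

Here `msg` holds the same "missing value where TRUE/FALSE needed" text as
in the session above, which points to NA values among the units selected
by the subset rather than to non-positive indices.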

