Re: [aroma.affymetrix] "Error: Failed to retrieve genome information" + paired samples

2016-05-13 Thread Henrik Bengtsson
Hi,

comments below.

On Fri, May 13, 2016 at 10:24 AM, Gaius Augustus wrote:
> Hello,
> I'm working on the Paired PSCBS protocol, and am running across an error.
>
> Here is my file structure
>
> -annotationData
> ---chipTypes
> -----GenomeWideSNP_6
> -------GenomeWideSNP_6,Full,na33,hg19,dbSNP137,HB20140118.ufl
> -------GenomeWideSNP_6,Full,na33,hg19,dbSNP137,HB20140118.ugp
> -------GenomeWideSNP_6,Full.cdf
> -------GenomeWideSNP_6,HB20080710.acs
> -------GenomeWideSNP_6.cdf
> -rawData
> ---Samples
> -----GenomeWideSNP_6
> -------LIST OF CEL FILES

This is all correct, except that "Samples" is not a very descriptive
name for a data set - try to pick a better name (although it won't
affect your analysis).

So, the GenomeWideSNP_6 chip type is special in the sense that
Affymetrix provides two different CDFs for it: the default one and the
"full" one.  You have both in
annotationData/chipTypes/GenomeWideSNP_6/, which is nothing strange
and that is the correct (and only) place to put them.
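
As a quick way to see which CDFs are available for a chip type, here is a
minimal base-R sketch.  The helper `listCdfs()` is hypothetical (it is not
part of aroma.affymetrix) and only assumes the annotationData/chipTypes/
layout described above, relative to the current working directory.

```r
# Hypothetical helper (not part of aroma.*): list the CDF files available
# for a chip type under an aroma-style annotationData/ directory.
listCdfs <- function(chipType, root = "annotationData") {
  path <- file.path(root, "chipTypes", chipType)
  # Return an empty vector if the directory does not exist
  if (!dir.exists(path)) return(character(0))
  list.files(path, pattern = "[.]cdf$", ignore.case = TRUE)
}

# With the layout in this thread, one would expect to see both
# "GenomeWideSNP_6.cdf" and "GenomeWideSNP_6,Full.cdf" here.
print(listCdfs("GenomeWideSNP_6"))
```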

The problem you're having is that you didn't tell Aroma to use the
"full" CDF, so it's going with the "default" one.  This is why you're
seeing "Chip type: GenomeWideSNP_6" in:

>  AffymetrixCelSet:
>  Name: Samples
>  Tags:
>  Path: rawData/Samples/GenomeWideSNP_6
>  Platform: Affymetrix
>  Chip type: GenomeWideSNP_6
>  Number of arrays: 116
> [...]

(and not "Chip type: GenomeWideSNP_6,Full").  So, when Aroma later
wants to access the annotation data for the default CDF, it cannot
find any, because the files you have are only the ones for the "full"
CDF.  That's why you get the error.
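
To make the naming convention concrete: a chip type's full name is its
name followed by comma-separated tags.  The sketch below splits such a
string and checks for the "Full" tag.  Both helpers are hypothetical
illustrations in base R, not aroma.affymetrix functions.

```r
# Hypothetical helpers (not part of aroma.*): split a chip type full name
# such as "GenomeWideSNP_6,Full" into its name and its tags.
parseChipType <- function(fullname) {
  parts <- strsplit(fullname, split = ",", fixed = TRUE)[[1]]
  list(name = parts[1], tags = parts[-1])
}

# TRUE if the full name carries the "Full" tag
usesFullCdf <- function(fullname) {
  "Full" %in% parseChipType(fullname)$tags
}

print(usesFullCdf("GenomeWideSNP_6"))       # the default CDF
print(usesFullCdf("GenomeWideSNP_6,Full"))  # the "full" CDF
```

On a real data set, the string to check would presumably come from
something like getChipType(getCdf(csR)); that call is an assumption here
and is not shown in the thread.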

You don't show how you set up your AffymetrixCelSet `csR` object, but
there are basically two ways to do it.  You can either set up the
AffymetrixCdfFile `cdf` explicitly and pass that when you set up the
`csR` object, or you can do both in one step.  I typically do:

cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags="Full")
csR <- AffymetrixCelSet$byName("Samples", cdf=cdf)

because that is very explicit about which CDF is being used.  You can
also do these two in one step as:

csR <- AffymetrixCelSet$byName("Samples", chipType="GenomeWideSNP_6,Full")

But again, I think the first approach is much clearer.  This is also
what http://www.aroma-project.org/vignettes/CRMAv2/ uses.  If you look
at that vignette, it also calls getGenomeInformation(cdf) etc. at the
very beginning.  A user doesn't really need to call those, because
they'll be called internally as needed.  Instead they are there just to
assert that you have all needed annotation data up front and that
Aroma can find them.  So, if you call those on the "full" CDF, you
shouldn't get any errors.  If you don't, then you know you're ready to go.
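
For reference, here is a sketch of those up-front checks on the "full"
CDF, modeled on the beginning of the CRMAv2 vignette.  It assumes that
aroma.affymetrix is installed and that the working directory contains the
annotationData/ tree shown above.  The guard variable `canCheck` is only
there so the snippet skips cleanly when that is not the case.

```r
# Sketch: assert up front that Aroma can find the annotation data for the
# "full" GenomeWideSNP_6 CDF.  Skips if the package or the CDF is missing.
cdfFile <- file.path("annotationData", "chipTypes", "GenomeWideSNP_6",
                     "GenomeWideSNP_6,Full.cdf")
canCheck <- requireNamespace("aroma.affymetrix", quietly = TRUE) &&
            file.exists(cdfFile)

if (canCheck) {
  library("aroma.affymetrix")
  cdf <- AffymetrixCdfFile$byChipType("GenomeWideSNP_6", tags = "Full")
  print(cdf)

  gi <- getGenomeInformation(cdf)  # needs the UGP file
  print(gi)

  si <- getSnpInformation(cdf)     # needs the UFL file
  print(si)

  # Needs the ACS file; note that it is looked up without the "Full" tag
  acs <- AromaCellSequenceFile$byChipType(getChipType(cdf, fullname = FALSE))
  print(acs)
} else {
  message("Skipping checks: aroma.affymetrix or the 'Full' CDF was not found.")
}
```

If any of these calls throws an error, the corresponding annotation file
is the one to look into before running doASCRMAv2().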

Hope this helps to get you started

Henrik

>
>
>
> Everything seems to work fine in the Paired PSCBS protocol (except for one
> thing, which I'll ask at the end), until I run:
>
> res <- doASCRMAv2(csR, verbose=verbose)
>
> 20160512 16:56:04|CRMAv2...
> 20160512 16:56:04| Arguments:
> 20160512 16:56:04| combineAlleles: FALSE
> 20160512 16:56:04| arrays:
>   chr ""
> 20160512 16:56:04| Data set
>  AffymetrixCelSet:
>  Name: Samples
>  Tags:
>  Path: rawData/Samples/GenomeWideSNP_6
>  Platform: Affymetrix
>  Chip type: GenomeWideSNP_6
>  Number of arrays: 116
>  Names: TCGA-3L-AA1B_Normal_Black, TCGA-3L-AA1B_Tumor_Black,
> TCGA-4N-A93T_Normal_Black, ..., TCGA-WS-AB45_Tumor_Black [116]
>  Time period: 2011-03-08 10:19:00 -- 2014-08-21 11:25:30
>  Total file size: 7645.59MB
>  RAM: 0.14MB
> 20160512 16:56:10| Checking whether final results are available or not...
> 20160512 16:56:10| Checking whether final results are available or
> not...done
> 20160512 16:56:10| CRMAv2/Allelic crosstalk calibration...
> [2016-05-12 16:56:11] Exception: Failed to retrieve genome information for
> this chip type: GenomeWideSNP_6
>
>
>   at #28. getGenomeInformation.AffymetrixCdfFile(cdf)
>   - getGenomeInformation.AffymetrixCdfFile() is in environment
> 'aroma.affymetrix'
>
>
>   at #27. getGenomeInformation(cdf)
>   - getGenomeInformation() is in environment 'aroma.affymetrix'
>
>
>   at #26. getSubsetToAvg.AllelicCrosstalkCalibration(this)
>   - getSubsetToAvg.AllelicCrosstalkCalibration() is in environment
> 'aroma.affymetrix'
>
>
>   at #25. getSubsetToAvg(this)
>   - getSubsetToAvg() is in environment 'aroma.affymetrix'
>
>
>   at #24. getParameters.AllelicCrosstalkCalibration(this, ...)
>   - getParameters.AllelicCrosstalkCalibration() is in environment
> 'aroma.affymetrix'
>
>
>   at #23. getParameters(this, ...)
>   - getParameters() is in environment 'aroma.core'
>
>
>   at #22. getParameterSets.ParametersInterface(this, ..., drop = FALSE)
>   - getParameterSets.ParametersInterface() is in environment
> 'aroma.core'
>
>
>   at #21. getParameterSets(this, ..., drop = FALSE)
>   - getParameterSets() is in environment 'aroma.core'
>
>
>   at #20. 

[aroma.affymetrix] "Error: Failed to retrieve genome information" + paired samples

2016-05-13 Thread Gaius Augustus
Hello,
I'm working on the Paired PSCBS protocol, and am running across an error.

Here is my file structure

-annotationData
---chipTypes
-----GenomeWideSNP_6
-------GenomeWideSNP_6,Full,na33,hg19,dbSNP137,HB20140118.ufl
-------GenomeWideSNP_6,Full,na33,hg19,dbSNP137,HB20140118.ugp
-------GenomeWideSNP_6,Full.cdf
-------GenomeWideSNP_6,HB20080710.acs
-------GenomeWideSNP_6.cdf
-rawData
---Samples
-----GenomeWideSNP_6
-------LIST OF CEL FILES



Everything seems to work fine in the Paired PSCBS protocol (except for one 
thing, which I'll ask at the end), until I run:

res <- doASCRMAv2(csR, verbose=verbose)

20160512 16:56:04|CRMAv2...
20160512 16:56:04| Arguments:
20160512 16:56:04| combineAlleles: FALSE
20160512 16:56:04| arrays:
  chr ""
20160512 16:56:04| Data set
 AffymetrixCelSet:
 Name: Samples
 Tags: 
 Path: rawData/Samples/GenomeWideSNP_6
 Platform: Affymetrix
 Chip type: GenomeWideSNP_6
 Number of arrays: 116
 Names: TCGA-3L-AA1B_Normal_Black, TCGA-3L-AA1B_Tumor_Black,
TCGA-4N-A93T_Normal_Black, ..., TCGA-WS-AB45_Tumor_Black [116]
 Time period: 2011-03-08 10:19:00 -- 2014-08-21 11:25:30
 Total file size: 7645.59MB
 RAM: 0.14MB
20160512 16:56:10| Checking whether final results are available or not...
20160512 16:56:10| Checking whether final results are available or not...done
20160512 16:56:10| CRMAv2/Allelic crosstalk calibration...
[2016-05-12 16:56:11] Exception: Failed to retrieve genome information for 
this chip type: GenomeWideSNP_6


  at #28. getGenomeInformation.AffymetrixCdfFile(cdf)
  - getGenomeInformation.AffymetrixCdfFile() is in environment 
'aroma.affymetrix'


  at #27. getGenomeInformation(cdf)
  - getGenomeInformation() is in environment 'aroma.affymetrix'


  at #26. getSubsetToAvg.AllelicCrosstalkCalibration(this)
  - getSubsetToAvg.AllelicCrosstalkCalibration() is in environment 
'aroma.affymetrix'


  at #25. getSubsetToAvg(this)
  - getSubsetToAvg() is in environment 'aroma.affymetrix'


  at #24. getParameters.AllelicCrosstalkCalibration(this, ...)
  - getParameters.AllelicCrosstalkCalibration() is in environment 
'aroma.affymetrix'


  at #23. getParameters(this, ...)
  - getParameters() is in environment 'aroma.core'


  at #22. getParameterSets.ParametersInterface(this, ..., drop = FALSE)
  - getParameterSets.ParametersInterface() is in environment 
'aroma.core'


  at #21. getParameterSets(this, ..., drop = FALSE)
  - getParameterSets() is in environment 'aroma.core'


  at #20. getParametersAsString.ParametersInterface(this)
  - getParametersAsString.ParametersInterface() is in environment 
'aroma.core'


  at #19. getParametersAsString(this)
  - getParametersAsString() is in environment 'aroma.core'


  at #18. sprintf("Algorithm parameters: %s", getParametersAsString(this))
  - sprintf() is in environment 'base'


  at #17. as.character.AromaTransform(x)
  - as.character.AromaTransform() is in environment 'aroma.core'


  at #16. as.character(x)
  - as.character() is local of the calling function


  at #15. print(as.character(x))
  - print() is in environment 'base'


  at #14. print.Object(...)
  - print.Object() is in environment 'R.oo'


  at #13. print(...)
  - print() is in environment 'base'


  at #12. eval(expr, envir, enclos)
  - eval() is local of the calling function


  at #11. eval(expr, pf)
  - eval() is in environment 'base'


  at #10. withVisible(eval(expr, pf))
  - withVisible() is in environment 'base'


  at #09. evalVis(expr)
  - evalVis() is local of the calling function


  at #08. capture.Verbose(this, print(...), level = level)
  - capture.Verbose() is in environment 'R.utils'


  at #07. capture(this, print(...), level = level)
  - capture() is in environment 'R.utils'


  at #06. print.Verbose(verbose, acc)
  - print.Verbose() is in environment 'R.utils'


  at #05. print(verbose, acc)
  - print() is in environment 'base'


  at #04. doCRMAv2.AffymetrixCelSet(..., combineAlleles = FALSE)
  - doCRMAv2.AffymetrixCelSet() is in environment 'aroma.affymetrix'


  at #03. doCRMAv2(..., combineAlleles = FALSE)
  - doCRMAv2() is in environment 'aroma.affymetrix'


  at #02. doASCRMAv2.default(csR, verbose = verbose)
  - doASCRMAv2.default() is in environment 'aroma.affymetrix'


  at #01. doASCRMAv2(csR, verbose = verbose)
  - doASCRMAv2() is in environment 'aroma.affymetrix'


Error: Failed to retrieve genome information for this chip type: 
GenomeWideSNP_6


*Problems*
1) As you can see, I'm getting the "Failed to retrieve genome information"
error.  Looking through the forums and the site, it seems that I only need
the acs, ufl, cdf, and ugp files.  Those are in the annotationData folder,
so I assume I'm doing something else wrong.
2) I have ~60 paired samples I'd like to run through.  The example only 
notes how to