Re: [aroma.affymetrix] Load Already Preprocessed Arrays

Henrik Bengtsson Thu, 20 Jan 2011 02:03:42 -0800

Hi.

On Wed, Jan 19, 2011 at 11:54 PM, Gregory W <greg.d.w...@gmail.com> wrote:
> Hello,
>
> I've been able to take advantage of CRMAv2's ability to process arrays
> in parallel, which is great.
>
> However, when I revisit the data to run, say, CBS on different subsets
> of the samples I encounter a very long data load time.
>
> In particular when I open a new R session I basically run the
> following code:
>
>> chipType <- "GenomeWideSNP_6"
>> dataSet   <- "MyData"
>
>> cdf     <- AffymetrixCdfFile$byChipType(chipType, tags="Full")
>> csAll  <- AffymetrixCelSet$byName(dataSet, cdf=cdf)
>
> I then load the normalized results with:
>
>> dsC <- doCRMA( csAll, cdf=cdf, combineAlleles=FALSE, verbose=verbose )
>
> This step takes a very long time even though Crosstalk, Base Position,
> Probe Summarization, and Fragment Length have already been performed on
> the samples.
>
> I get the following output from my R session when I execute the above
> statement:
>
> 20110118 14:54:22|  Identifying non-estimated units...
> 20110118 14:54:22|   Getting chip-effect set from data set...
> 20110118 14:54:22|    Retrieving monocell CDF...
> 20110118 14:54:22|     Monocell chip type:
> GenomeWideSNP_6,Full,monocell
> 20110118 14:54:22|     Locating monocell CDF...
> 20110118 14:54:22|      Pathname: annotationData/chipTypes/
> GenomeWideSNP_6/GenomeWideSNP_6,Full,monocell.CDF
> 20110118 14:54:22|     Locating monocell CDF...done
> 20110118 14:54:22|    Retrieving monocell CDF...done
> 20110118 14:54:22|    Retrieving chip-effects from data set...
> 20110118 14:54:22|     Data set: LungAd
> 20110118 14:54:22|     Retrieving chip-effect #1 of 70 (Patient1)...
> 20110118 14:54:22|      Allocating empty chip-effect file...
> 20110118 14:54:22|       Pathname: plmData/Lung,ACC,ra,-XY,+300,AVG,
> +300,A+B/GenomeWideSNP_6/Patient1,chipEffects.CEL
> 20110118 14:54:22|       Temporary pathname: plmData/Lung,ACC,ra,-XY,
> +300,AVG,+300,A+B/GenomeWideSNP_6/Patient1,chipEffects.CEL.tmp
> 20110118 14:54:22|       Renaming temporary file...
> 20110118 14:54:22|       Renaming temporary file...done
> 20110118 14:54:22|      Allocating empty chip-effect file...done
> 20110118 14:54:22|      Setting up CnChipEffectFile...
> 20110118 14:54:22|       Pathname: plmData/Lung,ACC,ra,-XY,+300,AVG,
> +300,A+B/GenomeWideSNP_6/Patient1,chipEffects.CEL
> 20110118 14:54:23|      Setting up CnChipEffectFile...done
> 20110118 14:54:23|     Retrieving chip-effect #1 of 70
> (Patient1)...done
> 20110118 14:54:23|     Retrieving chip-effect #2 of 70 (Patient2)...
> 20110118 14:54:23|      Allocating empty chip-effect file...
> 20110118 14:54:23|       Pathname: plmData/Lung,ACC,ra,-XY,+300,AVG,
> +300,A+B/GenomeWideSNP_6/Patient2,chipEffects.CEL
> 20110118 14:54:23|       Temporary pathname: plmData/LungAd,ACC,ra,-XY,
> +300,AVG,+300,A+B/GenomeWideSNP_6/Patient2,chipEffects.CEL.tmp
> 20110118 14:54:23|       Renaming temporary file...
> 20110118 14:54:23|       Renaming temporary file...done
> 20110118 14:54:23|      Allocating empty chip-effect file...done
> 20110118 14:54:23|      Setting up CnChipEffectFile...
> 20110118 14:54:23|       Pathname: plmData/LungAd,ACC,ra,-XY,+300,AVG,
> +300,A+B/GenomeWideSNP_6/Patient2,chipEffects.CEL
>
> and it goes on for hours...


Yes, but only the first time.  Then it should be fairly quick.

>
> Once it finishes I then run something like this:
>
>>  dsR  <- getAverageFile(dsC$total)
>>  dsT  <- extract(dsC$total, subset.of.interest)
>
>>  cns  <- CbsModel( dsT, dsR )
>
>
> Could anyone give me suggestions of how I could accomplish rerunning
> CBS for different subsets without having to constantly execute:
>
>> dsC <- doCRMA( csAll, cdf=cdf, combineAlleles=FALSE, verbose=verbose )
>
> to load the normalized results?

You can grab the 'dsC$total' data set as follows:

dataSet <- "Lung";
tags <- "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY";
chipType <- "GenomeWideSNP_6";
dsT <- AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType);

This is described in the how-to page 'Setting up an
AromaUnitTotalCnBinarySet or an AromaUnitFracBCnBinarySet':

  http://aroma-project.org/howtos/SetupOfAromaUnitNnnCnBinarySet
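
If you are unsure which data set name and tags to pass, one way to
check is to look at what is actually on disk. The sketch below assumes
the exported total copy-number files sit under
totalAndFracBData/<dataSet>,<tags>/<chipType>/ relative to the working
directory; that layout and the helper name splitFullName() are
assumptions for illustration, not part of the aroma API.

```r
# Hypothetical helper: split a directory name of the form
# "<dataSet>,<tag1>,<tag2>,..." into its data set name and tag string.
splitFullName <- function(fullname) {
  parts <- strsplit(fullname, split = ",", fixed = TRUE)[[1]]
  list(
    dataSet = parts[1],
    tags    = paste(parts[-1], collapse = ",")
  )
}

# Assumed root directory for exported total CN / fracB data sets.
rootPath <- "totalAndFracBData"

# List every data set directory found there, with name and tags separated.
if (file.exists(rootPath)) {
  for (fullname in list.files(rootPath)) {
    str(splitFullName(fullname))
  }
}
```

The dataSet and tags values recovered this way are what the byName()
call for dsT above expects.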

Then wrap it up in the dsC list as:

dsC <- list();
dsC$total <- dsT;
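
Putting the pieces together, rerunning CBS on a subset in a fresh R
session could look roughly like the following. This is only a sketch:
the function name rerunCbs() and its arguments are made up for
illustration, it assumes aroma.affymetrix is installed and that R is
started in the aroma project root, and 'subset' is assumed to be a
vector of sample indices. The calls inside are the ones already used in
this thread, plus fit() to run the segmentation.

```r
# Hypothetical wrapper: load the already exported total CN data and
# segment a subset of samples, without calling doCRMA() again.
rerunCbs <- function(dataSet, tags, chipType, subset, verbose = FALSE) {
  # Attach the package inside the function so that merely defining the
  # function does not require it to be installed.
  library("aroma.affymetrix")

  # All samples, as stored on disk by the earlier CRMAv2 run.
  dsAll <- AromaUnitTotalCnBinarySet$byName(dataSet, tags = tags,
                                            chipType = chipType)

  # Reference: the average across all samples, as in the original post.
  dsR <- getAverageFile(dsAll)

  # Test samples: only the subset of interest.
  dsT <- extract(dsAll, subset)

  # Set up the CBS model and fit it.
  cns <- CbsModel(dsT, dsR)
  fit(cns, verbose = verbose)

  cns
}

# Example call (not run here; needs the data on disk):
# cns <- rerunCbs("Lung", tags = "ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY",
#                 chipType = "GenomeWideSNP_6", subset = 1:10)
```

Computing the reference from the full set before extracting the subset
mirrors the original code, where the average is taken over dsC$total
rather than over the subset only.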

Hope this helps.

/Henrik


>
> Thanks a bunch in advance!
> Greg
>
> --
> When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
> version of the package, 2) to report the output of sessionInfo() and 
> traceback(), and 3) to post a complete code example.
>
>
> You received this message because you are subscribed to the Google Groups 
> "aroma.affymetrix" group with website http://www.aroma-project.org/.
> To post to this group, send email to aroma-affymetrix@googlegroups.com
> To unsubscribe and other options, go to http://www.aroma-project.org/forum/
>

