Re: [aroma.affymetrix] Unpaired copy number analysis with doCRMAv2

2011-06-17 Thread Henrik Bengtsson
On Fri, Jun 17, 2011 at 12:15 PM, Henrik Bengtsson
henrik.bengts...@aroma-project.org wrote:
 Hi.

 On Thu, Jun 16, 2011 at 9:52 AM, Xavi Sole xavier.s...@gmail.com wrote:
 Hi all,
 I have successfully performed CRMAv2 on a set of ~200 Affy 6.0 samples. I
 have tumor and normal samples, and now I want to apply an unpaired CBS
 model. I know ho to do the paired model, but I'm not sure is this code is
 right for the unpaired one:
 library(aroma.affymetrix);
 library(aroma.cn);

 No need for 'aroma.cn' here.

 verbose - Arguments$getVerbose(-8, timestamp=TRUE);
 options(digits=4);
 setOption(aromaSettings, memory/ram, 500.0)
 # Verify that the data set can be located:
 cdf - AffymetrixCdfFile$byChipType(GenomeWideSNP_6, tags=Full);
 dataSet - CEL_files;

 Each analysis you run should have a unique data set name, i.e. what
 name should you use next time?  I recommend to use a much more
 informative name of your data set - something that reflects the
 project name and/or the investigator and the year, or similar.

 dsR - AffymetrixCelSet$byName(dataSet, cdf=cdf);
 setFullName(dsR, dataSet);

 No need to do this.

 # Using CRMAv2
 dsList - doCRMAv2(dataSet, cdf=cdf, combineAlleles=FALSE, verbose=verbose);

 Since you've already have setup an AffymetrixCelSet object ('dsR')
 above, you should use that instead:

 dsList - doCRMAv2(dsR, combineAlleles=FALSE, verbose=verbose);

 or even shorter for allele-specific analysis, you can equivalently do:

 dsList - doASCRMAv2(dsR, verbose=verbose);

 dsC - dsList$total;
 # Splitting the dataset in normal-tumor pairs
 infoSamples - read.table(samples.txt, sep=\t, header=TRUE, as.is=TRUE,
 strip.white=TRUE);
 #Rows in this file are in the same order as arrays in the aroma dataset.
 idxN - which(infoSamples$type==Normal);
 idxT - which(infoSamples$type==Tumor);

 This is a bit dangerous since it assumes that the rows in your
 'samples.txt' file is ordered the same way as the files on this.  At
 least I would assert that that is the case by something like:

 stopifnot(all(infoSamples$name == getName(dsC));

 or similar.

 dsN - extract(dsC, idxN);
 dsT - extract(dsC, idxT);
 # Circular binary segmentation T vs N: Paired model (first N sample pairs
 with first T sample)
 cns - CbsModel(dsT, dsN);
 fit(cns, verbose=TRUE);
 regDatList - getRegions(cns);

 Yes, this does a tumor-normal paired CBS segmentation (and extracts
 the table of regions afterward).

 # Circular binary segmentation T vs N: Unpaired: I DON'T KNOW IF THIS IS
 CORRECT (cns.unpaired keeps me telling that the reference sample is made of
 102 files instead of one!).
 avgN - getAverageFile(dsN);

 This is the correct command and 'avgN' should *indeed* a single file
 (array) here.  You should be able verify by looking at what:

 print(avgN);

 returns.

 cns.unpaired - CbsModel(dsT, avgN);

 This should work, because 'dsT' is a data set; verify by

 print(dsT);

 If it doesn't work please let us know what:

 print(dsT);
 print(avgN);

 returns and what the complete error message is.

 fit(cns.unpaired, verbose=TRUE);
 regDatList.unpaired - getRegions(cns.unpaired);

I forgot to add that you should find the recent thread 'Combine paired
and unpaired CN analysis' (started on May 23, 2001) useful:

https://groups.google.com/forum/#!topic/aroma-affymetrix/220dWcCxxn8

/Henrik


 [snip]

 I'll answer your second question in a separate thread ...and when I
 find the time.

 /Henrik

 Thank you very much for your help!
 Best,
 Xavi.
 #SESSION INFO
 sessionInfo()
 R version 2.13.0 (2011-04-13)
 Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
 locale:
 [1] C
 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods   base
 other attached packages:
  [1] DNAcopy_1.26.0         aroma.cn_0.5.2         aroma.affymetrix_2.1.0
  [4] aroma.apd_0.1.8        affxparser_1.24.0      R.huge_0.2.2
  [7] aroma.core_2.1.0       aroma.light_1.20.0     matrixStats_0.2.2
 [10] R.rsp_0.5.3            R.cache_0.4.2          R.filesets_1.0.1
 [13] digest_0.5.0           R.utils_1.7.5          R.oo_1.8.0
 [16] R.methodsS3_1.2.1
 loaded via a namespace (and not attached):
 [1] tools_2.13.0

 --
 When reporting problems on aroma.affymetrix, make sure 1) to run the latest
 version of the package, 2) to report the output of sessionInfo() and
 traceback(), and 3) to post a complete code example.


 You received this message because you are subscribed to the Google Groups
 aroma.affymetrix group with website http://www.aroma-project.org/.
 To post to this group, send email to aroma-affymetrix@googlegroups.com
 To unsubscribe and other options, go to http://www.aroma-project.org/forum/



-- 
When reporting problems on aroma.affymetrix, make sure 1) to run the latest 
version of the package, 2) to report the output of sessionInfo() and 
traceback(), and 3) to post a complete code example.


You received this message because you are subscribed to the Google Groups 
aroma.affymetrix group with website 

Re: [aroma.affymetrix] Unpaired copy number analysis with doCRMAv2

2011-06-17 Thread Xavi Sole
Dear Henrik,

thank you so much for your complete answer! I'll try your suggestions on
monday and give you some feedback!

Best,

Xavi.

2011/6/17 Henrik Bengtsson henrik.bengts...@aroma-project.org

 Hi.

 On Thu, Jun 16, 2011 at 9:52 AM, Xavi Sole xavier.s...@gmail.com wrote:
  Hi all,
  I have successfully performed CRMAv2 on a set of ~200 Affy 6.0 samples. I
  have tumor and normal samples, and now I want to apply an unpaired CBS
  model. I know ho to do the paired model, but I'm not sure is this code is
  right for the unpaired one:
  library(aroma.affymetrix);
  library(aroma.cn);

 No need for 'aroma.cn' here.

  verbose - Arguments$getVerbose(-8, timestamp=TRUE);
  options(digits=4);
  setOption(aromaSettings, memory/ram, 500.0)
  # Verify that the data set can be located:
  cdf - AffymetrixCdfFile$byChipType(GenomeWideSNP_6, tags=Full);
  dataSet - CEL_files;

 Each analysis you run should have a unique data set name, i.e. what
 name should you use next time?  I recommend to use a much more
 informative name of your data set - something that reflects the
 project name and/or the investigator and the year, or similar.

  dsR - AffymetrixCelSet$byName(dataSet, cdf=cdf);
  setFullName(dsR, dataSet);

 No need to do this.

  # Using CRMAv2
  dsList - doCRMAv2(dataSet, cdf=cdf, combineAlleles=FALSE,
 verbose=verbose);

 Since you've already have setup an AffymetrixCelSet object ('dsR')
 above, you should use that instead:

 dsList - doCRMAv2(dsR, combineAlleles=FALSE, verbose=verbose);

 or even shorter for allele-specific analysis, you can equivalently do:

 dsList - doASCRMAv2(dsR, verbose=verbose);

  dsC - dsList$total;
  # Splitting the dataset in normal-tumor pairs
  infoSamples - read.table(samples.txt, sep=\t, header=TRUE, as.is
 =TRUE,
  strip.white=TRUE);
  #Rows in this file are in the same order as arrays in the aroma dataset.
  idxN - which(infoSamples$type==Normal);
  idxT - which(infoSamples$type==Tumor);

 This is a bit dangerous since it assumes that the rows in your
 'samples.txt' file is ordered the same way as the files on this.  At
 least I would assert that that is the case by something like:

 stopifnot(all(infoSamples$name == getName(dsC));

 or similar.

  dsN - extract(dsC, idxN);
  dsT - extract(dsC, idxT);
  # Circular binary segmentation T vs N: Paired model (first N sample pairs
  with first T sample)
  cns - CbsModel(dsT, dsN);
  fit(cns, verbose=TRUE);
  regDatList - getRegions(cns);

 Yes, this does a tumor-normal paired CBS segmentation (and extracts
 the table of regions afterward).

  # Circular binary segmentation T vs N: Unpaired: I DON'T KNOW IF THIS IS
  CORRECT (cns.unpaired keeps me telling that the reference sample is made
 of
  102 files instead of one!).
  avgN - getAverageFile(dsN);

 This is the correct command and 'avgN' should *indeed* a single file
 (array) here.  You should be able verify by looking at what:

 print(avgN);

 returns.

  cns.unpaired - CbsModel(dsT, avgN);

 This should work, because 'dsT' is a data set; verify by

 print(dsT);

 If it doesn't work please let us know what:

 print(dsT);
 print(avgN);

 returns and what the complete error message is.

  fit(cns.unpaired, verbose=TRUE);
  regDatList.unpaired - getRegions(cns.unpaired);

 [snip]

 I'll answer your second question in a separate thread ...and when I
 find the time.

 /Henrik

  Thank you very much for your help!
  Best,
  Xavi.
  #SESSION INFO
  sessionInfo()
  R version 2.13.0 (2011-04-13)
  Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
  locale:
  [1] C
  attached base packages:
  [1] stats graphics  grDevices utils datasets  methods   base
  other attached packages:
   [1] DNAcopy_1.26.0 aroma.cn_0.5.2 aroma.affymetrix_2.1.0
   [4] aroma.apd_0.1.8affxparser_1.24.0  R.huge_0.2.2
   [7] aroma.core_2.1.0   aroma.light_1.20.0 matrixStats_0.2.2
  [10] R.rsp_0.5.3R.cache_0.4.2  R.filesets_1.0.1
  [13] digest_0.5.0   R.utils_1.7.5  R.oo_1.8.0
  [16] R.methodsS3_1.2.1
  loaded via a namespace (and not attached):
  [1] tools_2.13.0
 
  --
  When reporting problems on aroma.affymetrix, make sure 1) to run the
 latest
  version of the package, 2) to report the output of sessionInfo() and
  traceback(), and 3) to post a complete code example.
 
 
  You received this message because you are subscribed to the Google Groups
  aroma.affymetrix group with website http://www.aroma-project.org/.
  To post to this group, send email to aroma-affymetrix@googlegroups.com
  To unsubscribe and other options, go to
 http://www.aroma-project.org/forum/
 

 --
 When reporting problems on aroma.affymetrix, make sure 1) to run the latest
 version of the package, 2) to report the output of sessionInfo() and
 traceback(), and 3) to post a complete code example.


 You received this message because you are subscribed to the Google Groups
 aroma.affymetrix group with website http://www.aroma-project.org/.
 To post to