Re: [aroma.affymetrix] Re: Combine paired and unpaired CN analysis

2011-06-27 Thread Henrik Bengtsson
Hi.

On Tue, Jun 21, 2011 at 9:30 AM, Kai wangz...@gmail.com wrote:
 Hi Henrik,

 Thanks for your help. However, I got some errors when trying our what
 you have suggested. I think the problem was that: when I did

 cns - CbsModel(dsR,dsP);

 The sample order in cns is no longer the same as that in dsR and
 dsP. So is there a way to change the sample name in dsR, since after
 the CbsModel is constructed, it is difficult to figure out which
 sample is which.

Yes, I think the default

 Also, the error message from setAlias() seems to suggest that this
 function can only change the name of individual sample, not a sample
 set. Is this correct?

Yes, that is a correct error message.  I'm sorry, I did a mistake;
setAlias() is for renaming the data set name not the individual
samples.  So, scrap all my suggestions about using setAlias() for what
you are trying to do.  Instead...

1. The names used in the ChromosomeExplorer ('ce') are those given by
getNames(ce) which in turn equals getNames(getModel(ce)) where
getModel(ce) is the CbsModel, that is, getNames(cns) is what is used
by ChromosomeExplorer.

2. You can tweak the behavior of getNames(cns) such that in turn calls
getPairedNames(cns).  This is still an unofficial tweak, so you have
to do the following to activate it:

  cns$.usePairedNames - TRUE;

Now, when you call getNames(ce), which returns getNames(cns), which in
turn returns getPairedNames(cns).

3. The default behavior of getPairedNames() is to return names of type
(i) GSM226871vsGSM226870 when there is matched reference sample,
whereas it returns (ii) GSM226867 when the reference samples is the
average of a pool.  In other words, in your case you should get names
of type (ii) whenever you have 'dfRef' as the reference.

Let me know if this works for you

Henrik

PS. If the above is not enough, there is even a way to customize how
the paired names are generated.  See the source code
print(getPairedNames.CopyNumberChromosomalModel) on what is going on.
Note however that this is a very unofficial solution which may change
in the future.



 Thank you again for your help. Below are the details of my codes and
 the error message I got.

 Best,
 Kai

 library(aroma.affymetrix);
 # Preprocessing using CRMAv2
 dataSet = Resistant_model;
 chipType = GenomeWideSNP_6;
 cdf - AffymetrixCdfFile$byChipType(chipType, tags=Full);
 dsC = doCRMAv2(dataSet, cdf=cdf);
 getNames(dsC)
 [1] Parental Resistant

 # paired analysis
 idxP - match(Parental, getNames(dsC));
 dsP - extract(dsC, idxP);
 getNames(dsP)
 [1] Parental

 idxR - match(Resistant, getNames(dsC));
 dsR - extract(dsC, idxR);
 getNames(dsR)
 [1] Resistant

 # unpaired analysis
 ds - 
 AromaUnitTotalCnBinarySet$byName(SNP6_HAPMAP_FEMALE,tags=ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,
  chipType=chipType);
 # get Hapmap average file
 dfRef - getAverageFile(ds, verbose=verbose);
 # combine paired and unpaired analysis
 dfRefList - rep(list(dfRef), times=nbrOfArrays(dsC));
 dsRef - AromaUnitTotalCnBinarySet(dfRefList);
 getNames(dsRef)
 [1] .average-signals-median-mad .average-signals-median-mad

 append(dsR,dsC);
 getNames(dsR)
 [1] Resistant Parental Resistant

 append(dsP,dsRef);
 getNames(dsP)
 [1] Parental                      .average-signals-median-mad
 [3] .average-signals-median-mad

 ### Everything up to this point is fine ###

 cns - CbsModel(dsR,dsP);
 getNames(cns)
 [1] Parental Resistant Resistant

 ### Now it seems that the CbsModel has re-ordered the samples 

 ce - ChromosomeExplorer(cns);
 names - getNames(ce);
 names
 [1] Parental Resistant Resistant

 ### Again, samples in ce are ordered according to cns, not dsR
 and dsP ###

 is_hapmap - (getNames(dsP)==getName(dfRef));
 names[is_hapmap] - sprintf(%s-vs-HapMap, names[is_hapmap]);
 names
 [1] Parental           Resistant-vs-HapMap Resistant-vs-HapMap

 setAlias(ce,names);
 Error in list(`setAlias(ce, names)` = environment, `setAlias.Explorer(ce, 
 names)` = environment,  :

 [2011-06-21 09:14:47] Exception: Number of elements in argument
 'alias' is out of range [0,1]: 3
  at throw(Exception(...))
  at throw.default(sprintf(Number of elements in argument '%s' is out
 of range
  at throw(sprintf(Number of elements in argument '%s' is out of
 range [%d,%d]:
  at getVector.Arguments(static, s, length = length, .name = .name)
  at getVector(static, s, length = length, .name = .name)
  at getCharacters.Arguments(static, ..., length = length)
  at getCharacters(static, ..., length = length)
  at getCharacter.Arguments(static, filename, nchar = nchar, .name
 = .name)
  at getCharacter(static, filename, nchar = nchar, .name = .name)
  at method(static, ...)
  at Arguments$getFilename(alias)
  at setAlias.Explorer(ce, names)
  at setAlias(ce, names)

 On Jun 17, 11:48 am, Henrik Bengtsson henrik.bengts...@aroma-
 project.org wrote:
 Hi.



 On Fri, Jun 17, 2011 at 10:05 AM, Kai wangz...@gmail.com wrote:
  Hi Henrik,

  Thank you very much for your reply. What you have suggested works,
  with the 

[aroma.affymetrix] Re: Combine paired and unpaired CN analysis

2011-06-21 Thread Kai
Hi Henrik,

Thanks for your help. However, I got some errors when trying our what
you have suggested. I think the problem was that: when I did

cns - CbsModel(dsR,dsP);

The sample order in cns is no longer the same as that in dsR and
dsP. So is there a way to change the sample name in dsR, since after
the CbsModel is constructed, it is difficult to figure out which
sample is which.

Also, the error message from setAlias() seems to suggest that this
function can only change the name of individual sample, not a sample
set. Is this correct?

Thank you again for your help. Below are the details of my codes and
the error message I got.

Best,
Kai

 library(aroma.affymetrix);
 # Preprocessing using CRMAv2
 dataSet = Resistant_model;
 chipType = GenomeWideSNP_6;
 cdf - AffymetrixCdfFile$byChipType(chipType, tags=Full);
 dsC = doCRMAv2(dataSet, cdf=cdf);
 getNames(dsC)
[1] Parental Resistant

 # paired analysis
 idxP - match(Parental, getNames(dsC));
 dsP - extract(dsC, idxP);
 getNames(dsP)
[1] Parental

 idxR - match(Resistant, getNames(dsC));
 dsR - extract(dsC, idxR);
 getNames(dsR)
[1] Resistant

 # unpaired analysis
 ds - 
 AromaUnitTotalCnBinarySet$byName(SNP6_HAPMAP_FEMALE,tags=ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,
  chipType=chipType);
 # get Hapmap average file
 dfRef - getAverageFile(ds, verbose=verbose);
 # combine paired and unpaired analysis
 dfRefList - rep(list(dfRef), times=nbrOfArrays(dsC));
 dsRef - AromaUnitTotalCnBinarySet(dfRefList);
 getNames(dsRef)
[1] .average-signals-median-mad .average-signals-median-mad

 append(dsR,dsC);
 getNames(dsR)
[1] Resistant Parental Resistant

 append(dsP,dsRef);
 getNames(dsP)
[1] Parental  .average-signals-median-mad
[3] .average-signals-median-mad

### Everything up to this point is fine ###

 cns - CbsModel(dsR,dsP);
 getNames(cns)
[1] Parental Resistant Resistant

### Now it seems that the CbsModel has re-ordered the samples 

 ce - ChromosomeExplorer(cns);
 names - getNames(ce);
 names
[1] Parental Resistant Resistant

### Again, samples in ce are ordered according to cns, not dsR
and dsP ###

 is_hapmap - (getNames(dsP)==getName(dfRef));
 names[is_hapmap] - sprintf(%s-vs-HapMap, names[is_hapmap]);
 names
[1] Parental   Resistant-vs-HapMap Resistant-vs-HapMap

 setAlias(ce,names);
 Error in list(`setAlias(ce, names)` = environment, `setAlias.Explorer(ce, 
 names)` = environment,  :

[2011-06-21 09:14:47] Exception: Number of elements in argument
'alias' is out of range [0,1]: 3
  at throw(Exception(...))
  at throw.default(sprintf(Number of elements in argument '%s' is out
of range
  at throw(sprintf(Number of elements in argument '%s' is out of
range [%d,%d]:
  at getVector.Arguments(static, s, length = length, .name = .name)
  at getVector(static, s, length = length, .name = .name)
  at getCharacters.Arguments(static, ..., length = length)
  at getCharacters(static, ..., length = length)
  at getCharacter.Arguments(static, filename, nchar = nchar, .name
= .name)
  at getCharacter(static, filename, nchar = nchar, .name = .name)
  at method(static, ...)
  at Arguments$getFilename(alias)
  at setAlias.Explorer(ce, names)
  at setAlias(ce, names)

On Jun 17, 11:48 am, Henrik Bengtsson henrik.bengts...@aroma-
project.org wrote:
 Hi.



 On Fri, Jun 17, 2011 at 10:05 AM, Kai wangz...@gmail.com wrote:
  Hi Henrik,

  Thank you very much for your reply. What you have suggested works,
  with the only problem being that there are now repeated sample names
  such that the links generated by ChromosomeExplorer don't work
  properly. I think this is because that I have analyzed the same sample
  twice, once w.r.t the matched parental DNA (paired analysis), the
  other time w.r.t. the Hapmap average (unpaired analysis). So my
  question is whether there is a way I can change the names of datasets
  in a AromaUnitTotalCnBinarySet. My current codes look like the
  following:

  library(aroma.affymetrix);
  # Preprocessing using CRMAv2
  dataSet = Resistant_model;
  chipType = GenomeWideSNP_6;
  cdf - AffymetrixCdfFile$byChipType(chipType, tags=Full);
  dsC = doCRMAv2(dataSet, cdf=cdf);
  # paired analysis
  idxP - match(Parental, getNames(dsC));
  dsP - extract(dsC, idxP);
  idxR - match(Resistant, getNames(dsC));
  dsR - extract(dsC, idxR);

 Here I would add a sanity check just in case:

 stopifnot(nbrOfArrays(dsP) == nbrOfArrays(dsR));

 because that is what you assume when you setup CbsModel() below, correct?

  # unpaired analysis
  ds - AromaUnitTotalCnBinarySet$byName(SNP6_HAPMAP_FEMALE,
  tags=ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY, chipType=chipType);
  # get Hapmap average file
  dfRef - getAverageFile(ds, verbose=verbose);
  # combine paired and unpaired analysis
  dfRefList - rep(list(dfRef), times=2);
  dsRef - AromaUnitTotalCnBinarySet(dfRefList);

 Here you create a data set 'dsRef' with *two* copies of the
 HapMap-pooled reference file...

  append(dsR,dsC);
  append(dsP,dsRef);

 ...and from those two lines, I suspect it is 

[aroma.affymetrix] Re: Combine paired and unpaired CN analysis

2011-06-17 Thread Kai
Hi Henrik,

Thank you very much for your reply. What you have suggested works,
with the only problem being that there are now repeated sample names
such that the links generated by ChromosomeExplorer don't work
properly. I think this is because that I have analyzed the same sample
twice, once w.r.t the matched parental DNA (paired analysis), the
other time w.r.t. the Hapmap average (unpaired analysis). So my
question is whether there is a way I can change the names of datasets
in a AromaUnitTotalCnBinarySet. My current codes look like the
following:

library(aroma.affymetrix);
# Preprocessing using CRMAv2
dataSet = Resistant_model;
chipType = GenomeWideSNP_6;
cdf - AffymetrixCdfFile$byChipType(chipType, tags=Full);
dsC = doCRMAv2(dataSet, cdf=cdf);
# paired analysis
idxP - match(Parental, getNames(dsC));
dsP - extract(dsC, idxP);
idxR - match(Resistant, getNames(dsC));
dsR - extract(dsC, idxR);
# unpaired analysis
ds - AromaUnitTotalCnBinarySet$byName(SNP6_HAPMAP_FEMALE,
tags=ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY, chipType=chipType);
# get Hapmap average file
dfRef - getAverageFile(ds, verbose=verbose);
# combine paired and unpaired analysis
dfRefList - rep(list(dfRef), times=2);
dsRef - AromaUnitTotalCnBinarySet(dfRefList);
append(dsR,dsC);
append(dsP,dsRef);
cns - CbsModel(dsR,dsP);
ce - ChromosomeExplorer(cns);
process(ce, verbose=verbose);

The problem arose when I appended dsC to dsR: since dsC already
contains dsR, the resulting variable will contain three datasets:
(dsR, dsR, dsP). The corresponding reference set contains (dsP,
Hapmap, Hapmap). These three comparisons are what I intended to do,
but the two dsRs will have the same name which has caused the
ChromosomeExplorer display problem. Any help you can provided here is
much appreciated!

Best,
Kai


On May 25, 6:20 pm, Henrik Bengtsson henrik.bengts...@aroma-
project.org wrote:
 Hi,

 sorry for the delay - your question required some extra explanation.



 On Mon, May 23, 2011 at 2:37 PM, Kai wangz...@gmail.com wrote:
  Hi, Henrik,

  I have some Affymetrix SNP6 data on a resistant cell line model after
  compound treatment. I want to do a paired analysis comparing the
  resistant cell line data to the parental. To make sure that the
  changes I see in the paired analysis is indeed acquired aberrations,
  not simply a magnitude change between the resistant and parental, I
  also want to analyze the resistant and parental data separately
  against a common reference (e.g. Hapmap). So my question is how I can
  combine both a paired and an unpaired analysis in aroma affymetrix/cn.

  Some codes I have for the paired analysis are as follows:

  library(aroma.affymetrix);
  verbose - Arguments$getVerbose(-8, timestamp=TRUE);

  options(digits=4);
  dataSet = Resistant_model;
  chipType = GenomeWideSNP_6;
  cdf - AffymetrixCdfFile$byChipType(chipType, tags=Full);

  dsC = doCRMAv2(dataSet, cdf=cdf, verbose=verbose);
  print(dsC);

  # paired analysis
  idxP - match(Parental, getNames(dsC));
  dsP - extract(dsC, idxP);

  idxR - match(Resistant, getNames(dsC));
  dsR - extract(dsC, idxR);

  cns2 - CbsModel(dsR,dsP);
  print(cns2);

 So, first thing to note is that you are here by specifying (dsR, dsP)
 manually telling CbsModel how to pair the samples, i.e. first sample
 in dsR will be coupled with the first sample in dsP, the second with
 each other and so on.



  ce2 - ChromosomeExplorer(cns2);

 Here, ChromosomeExplorer is just doing whatever CbsModel is setup to
 do, so it has nothing to do with ChromosomeExplorer, i.e. the study
 design is specified when you setup the CbsModel.

  print(ce2);

  process(ce2, verbose=verbose);

  I have also loaded a set of preprocessed Hapmap samples as follows:

  ds - AromaUnitTotalCnBinarySet$byName(SNP6_HAPMAP_FEMALE,
  tags=ACC,ra,-XY,BPN,-XY,AVG,A+B,FLN,-XY,
  chipType=GenomeWideSNP_6);
  print(ds);

  dsRef - getAverageFile(ds, verbose=verbose);
  print(dsRef);

 You should really name this 'dfRef to indicate that it is a data
 *file* (and not a data set).

 Ok, so here you have (i) a data set 'ds' and (ii) a data *file*
 'dfRef'.  What is useful to understand is that when you do:

 seg - CbsModel(ds);

 you are implicitly doing:

 seg - CbsModel(ds, dfRef);

 which in turn corresponds to specifying:

 # Create a data set of replicated reference samples
 dfRefList - rep(list(dfRef), times=length(ds));
 dsRef - AromaUnitTotalCnBinarySet(dfRefList);

 # Request a paired segmentation
 seg - CbsModel(ds, dsRef);

 Note, you now have two CbsModel:s where both do paired segmentation.
  What is left to do is make these two into one CbsModel:

 # Append the latter data sets to the first ones
 append(dsR, ds);
 append(dsP, dsRef);

 # Vola
 seg - CbsModel(dsR, dsP);

 Hope this helps

 /Henrik



  I was wondering whether there is a way that will allow me to construct
  a AromaUnitTotalCnBinarySet consisting of: 1) Resistant, 2) Resistant
  and 3) Parental; then compare it to another reference set consisting
  of: