Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-11 Thread Henrik Bengtsson
On Mon, Jun 10, 2013 at 7:42 AM, Wei Tang tangwei1...@gmail.com wrote:
 I have got to this step

 doCBS(dsTTuple, ref=dsNTuple, verbose=-10)

 Could you advise me on what comes next? How do I connect this to the
 following CN-calling step?

All you have to do is:

cns <- CbsModel(dsTTuple, ref=dsNTuple);


 fit(cns, verbose=verbose)

[Actually, doing CbsModel()+fit() is equivalent to calling doCBS(), so
you can skip doCBS() - doesn't matter].

 ce <- ChromosomeExplorer(cns)
 process(ce, verbose=verbose)

 Another question: when I do allele-specific CN analysis, I should follow
 TumorBoost, right? Or are there some changes?

You can follow the Paired PSCBS vignette online
[http://aroma-project.org/vignettes/PairedPSCBS-lowlevel].  By default
Paired PSCBS does TumorBoost normalization automatically before
segmentation.  You should also be aware of the Paired PSCBS (PDF)
vignette that is part of the PSCBS package.

/Henrik



 Many thanks,

 Wei


 On Thu, Jun 6, 2013 at 5:38 PM, Henrik Bengtsson
 henrik.bengts...@aroma-project.org wrote:

 On Thu, Jun 6, 2013 at 1:43 PM, Wei Tang tangwei1...@gmail.com wrote:
  Everything goes very well. Nsp and Sty matched.

 Great to hear.

 
  BUT I forgot to ask at which step I should pair the tumor and normal
  samples. According to the vignette "Paired total copy-number analysis":
 
  cns <- CbsModel(dsT, dsN)
 
  Now that I am using doCBS(), how can I connect to the above step?

 Good question.  So, for the tumor-normal use case with multiple chip
 types, you need to do a bit of manual setup first.  Assuming your
 tumors and normals are within the existing data set you use, here is
 an illustration.

 # General setup
 dataSet <- "EC500K";
 tags <- "ACC,-XY,BPN,-XY,RMA,A+B,FLN,-XY";
 chipTypes <- c("Mapping250K_Nsp", "Mapping250K_Sty");

 # Setup CRMAv2-generated data sets [this is what doCBS() does internally]
 dsList <- lapply(chipTypes, FUN=function(chipType) {
   AromaUnitTotalCnBinarySet$byName(dataSet, tags=tags, chipType=chipType)
 });
 dsTuple <- as.CopyNumberDataSetTuple(dsList);

 # Try...
 print(dsTuple);

 # ...and check the array names that CBS will match up.
 print(getNames(dsTuple));


 FYI, you could now launch the non-paired doCBS() as before using this
 data-set tuple class:
 doCBS(dsTuple, verbose=-10);

 However, for the matched tumor-normal segmentation, you need to split
 up your data-set tuple into a tumor and a normal one.  Assuming you
 know the indices of the samples in the above tuple as ordered by
 getNames(dsTuple), you can do:

 # Example:
 idxsT <- c(1:3, 6:7)
 idxsN <- c(4:5, 8:10)

 dsTTuple <- extract(dsTuple, idxsT);
 dsNTuple <- extract(dsTuple, idxsN);

 # Verify you have the correct samples:
 print(getNames(dsTTuple));
 print(getNames(dsNTuple));

 # Assert that they have the same number of samples:
 # [doCBS() will do this too]
 stopifnot(length(dsTTuple) == length(dsNTuple));

 # Matched tumor-normal segmentation (pay attention to the verbose
 # output at the beginning)
 doCBS(dsTTuple, ref=dsNTuple, verbose=-10);
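
 Rather than typing the indices by hand, they can sometimes be derived
 from the sample names. The following is a minimal base-R sketch, not
 part of aroma itself. It assumes (a guess based on the names shown in
 this thread, not something stated in it) that tumors are named like
 E1507T and their matched normals like SHE1507, and that the i-th tumor
 is meant to be paired with the i-th normal. The vector sampleNames is
 a stand-in for getNames(dsTuple).

```r
# Stand-in for getNames(dsTuple); deliberately not in paired order.
sampleNames <- c("E1507T", "SHE1510", "E1510T", "SHE1507")

# Assumed convention: "E<id>T" is a tumor, "SHE<id>" its matched normal.
isTumor  <- grepl("^E[0-9]+T$", sampleNames)
isNormal <- grepl("^SHE[0-9]+$", sampleNames)

# Extract the shared id from each name.
idsT <- sub("^E([0-9]+)T$", "\\1", sampleNames[isTumor])
idsN <- sub("^SHE([0-9]+)$", "\\1", sampleNames[isNormal])

# Order the normals so that the i-th normal matches the i-th tumor.
idxsT <- which(isTumor)
idxsN <- which(isNormal)[match(idsT, idsN)]

# Every tumor must have found its normal.
stopifnot(!anyNA(idxsN), length(idxsT) == length(idxsN))

print(data.frame(tumor=sampleNames[idxsT], normal=sampleNames[idxsN]))
```

 The resulting idxsT and idxsN could then be passed to extract() in
 place of the hand-written index vectors.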

 Hope this helps

 Henrik

 
  Many thanks,
 
  Wei
 
 

Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-06 Thread Wei Tang
Everything goes very well. Nsp and Sty matched.

BUT I forgot to ask at which step I should pair the tumor and normal
samples. According to the vignette "Paired total copy-number analysis":

cns <- CbsModel(dsT, dsN)

Now that I am using doCBS(), how can I connect to the above step?

Many thanks,

Wei


On Wednesday, June 5, 2013 5:50:52 PM UTC-4, Henrik Bengtsson wrote:

 On Wed, Jun 5, 2013 at 2:15 PM, Wei Tang tangw...@gmail.com wrote:
  Do they need to have the same names in 500K? What about SNP5? Those
  are additional samples.

 Yes (I did write this in my long initial message).  doCBS() identifies
 which tuples of arrays belong to which samples by matching up their
 *names*, that is, by looking at:

 getNames(dsC_EC500K_Nsp) 
 getNames(dsC_EC500K_Sty) 

 Note the difference between *names* and *full names* (cf. 
 http://aroma-project.org/definitions/namesAndTags).  In your case the 
 *names* are: 

 > getNames(dsC_EC500K_Sty)
 [1] "E1507T_STY" "E1510T_STY" "E1520T_STY" ... "SHE1796_STY"
 > getNames(dsC_EC500K_Nsp)
 [1] "E1507T_Nsp" "E1510T_Nsp" "E1520T_Nsp" ... "SHE1796_NSP"

 Because of this, doCBS() fails to pair them up.  So, yes, if they were:

 > getNames(dsC_EC500K_Sty)
 [1] "E1507T" "E1510T" "E1520T" ... "SHE1796"
 > getNames(dsC_EC500K_Nsp)
 [1] "E1507T" "E1510T" "E1520T" ... "SHE1796"

 it would work as you expect.  The *names* come directly from the
 *full names*, which by default come from the *file names* (see the link
 above).  Now, you don't have to rename the files to change the full
 names.  Instead you can use a so-called fullname translator function,
 which allows you to rename *full names* on the fly, cf.
 http://aroma-project.org/howtos/setFullNamesTranslator.  In your case
 you only have to replace the underscores (_) with commas (,) and
 everything will work.  So, do:

 fnt <- function(names, ...) gsub("_", ",", names, fixed=TRUE);
 setFullNamesTranslator(dsC_EC500K_Sty, fnt);
 setFullNamesTranslator(dsC_EC500K_Nsp, fnt);

 and you should get: 

 > getFullNames(dsC_EC500K_Sty)
 [1] "E1507T,STY,total" "E1510T,STY,total" ... "SHE1796,STY,total"
 > getFullNames(dsC_EC500K_Nsp)
 [1] "E1507T,Nsp,total" "E1510T,Nsp,total" ... "SHE1796,NSP,total"

 and therefore:

 > getNames(dsC_EC500K_Sty)
 [1] "E1507T" "E1510T" ... "SHE1796"
 > getNames(dsC_EC500K_Nsp)
 [1] "E1507T" "E1510T" ... "SHE1796"

 Then retry with doCBS(). 

 If your GenomeWideSNP_5 arrays have completely different name formats,
 you have to create a more elaborate fullname translator function that
 takes the input names and translates them to match the above.
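
 As a rough sketch of what a more targeted translator might look like
 (plain R, no aroma packages needed): the function below turns only a
 trailing "_<suffix>" of the sample name into a tag, so that the name is
 whatever remains before the first comma. The function name
 fntSuffixToTag, the helper nameOf and the example full names are
 illustrative assumptions; only the function(names, ...) shape comes
 from the discussion above.

```r
# Hypothetical translator: move a trailing "_<suffix>" of the sample name
# into a tag, e.g. "E1507T_STY,total" -> "E1507T,STY,total".
fntSuffixToTag <- function(names, ...) {
  # Only the first "_<letters>" that directly precedes a comma or the end
  # of the string is rewritten; other underscores are left alone.
  sub("_([A-Za-z]+)(,|$)", ",\\1\\2", names)
}

# Illustrative helper: the name is the part of a full name before the
# first comma (the rest are tags).
nameOf <- function(fullnames) sub(",.*$", "", fullnames)

fullnamesSty <- c("E1507T_STY,total", "SHE1507_STY,total")
fullnamesNsp <- c("E1507T_Nsp,total", "SHE1507_NSP,total")

namesSty <- nameOf(fntSuffixToTag(fullnamesSty))
namesNsp <- nameOf(fntSuffixToTag(fullnamesNsp))
print(namesSty)
print(identical(namesSty, namesNsp))
```

 Such a function would be registered in the same way as the simpler one,
 i.e. by passing it to setFullNamesTranslator() for each data set.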

 Hope this helps 

 Henrik 

  
  

Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-05 Thread Wei Tang
Hi Henrik ,

Thank you for your suggestion.

But when I ran

res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
"Mapping250K_Sty"), verbose=verbose);

it complained

Error in file(pathname, open = "rb") : invalid 'description' argument


Do you know how to fix it?
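
For reference, base R's file() gives this particular message when its
first argument (called description) is not a single character string,
for example when a pathname turns out to be NULL or empty. Whether that
is what happens inside doCBS() here is an assumption; the sketch below
only reproduces the base-R behaviour, and the variable pathname is
purely illustrative.

```r
# Call file() with an empty pathname and capture the error instead of
# stopping.
pathname <- character(0)
res <- tryCatch(
  file(pathname, open = "rb"),
  error = function(e) e
)
print(inherits(res, "error"))
print(conditionMessage(res))
```

If this is indeed the mechanism, the thing to look for is a step that
ends up with no file at all, rather than a file that cannot be read.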

My samples are all paired tumor-normal: 36 paired samples on SNP5 and
an additional 20 paired samples on 500K.

Should I use multi-source copy-number normalization?
And what about doASCRMAv2()? Is its usage the same as that of doCRMAv2()?

Many thanks,

Wei


On Thursday, May 30, 2013 6:05:55 PM UTC-4, Henrik Bengtsson wrote:

 Hi, 

 I've done some updates to the help pages (e.g. ?doCBS), so before
 anything else I recommend updating to aroma.core 2.9.5 and
 aroma.affymetrix 2.9.4:

 source("http://aroma-project.org/hbLite.R");
 hbInstall("aroma.affymetrix");


 On Tue, May 28, 2013 at 9:37 AM, Wei Tang tangw...@gmail.com wrote:
  Hi aroma.affymetrix developers, 
  
  Before I start the analysis, I just want to confirm the CN analysis of
  500K arrays with doCRMAv2, as I did not find a vignette specific to it.
  
  What I understand is:
  
  1. run 250K_Nsp
  dsC_Nsp=doCRMAv2(test,cdf=Nsp,verbose=verbose)
  
  2. run 250K_Sty
  
  dsC_Sty=doCRMAv2(test,cdf=Sty,verbose=verbose) 

 Yes, you can do CRMAv2 preprocessing for each chip type independently.
 However, for doCRMAv2() you need to do something like:

 dsC_Nsp <- doCRMAv2(dataSet, chipType="Mapping250K_Nsp", verbose=verbose)
 dsC_Sty <- doCRMAv2(dataSet, chipType="Mapping250K_Sty", verbose=verbose)

 Chip types have formal and strict names, cf. 
 http://aroma-project.org/definitions/chipTypesAndCDFs 

  
  3. merge them together by aroma.cn 

 Actually, despite its name, you don't need the aroma.cn package here.
 The basic CBS methods are still in the aroma.core package.  So, after
 doing the above doCRMAv2() processing, you then want to do something
 like:

 tags <- "ACC,-XY,BPN,-XY,AVG,A+B,FLN,-XY";  # Tags added by CRMAv2
 res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
 "Mapping250K_Sty"), verbose=verbose);

 It's important that the array *names* of the Mapping250K_Nsp and
 Mapping250K_Sty pair up, because that is how doCBS() knows which array
 files to pair up/merge in the segmentation.  doCBS() matches array
 names using the names from getNames(), e.g.

 names_Nsp <- getNames(dsC_Nsp);
 names_Sty <- getNames(dsC_Sty);

 If they don't match up, there are ways to change the names so they
 do, cf. http://aroma-project.org/howtos/setFullNamesTranslator
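
 A quick way to see whether two sets of names pair up is to compare them
 with setdiff(). The following base-R sketch uses made-up stand-ins for
 the two getNames() results (the values are illustrative, not taken from
 a real data set); it only shows the comparison, not how doCBS() itself
 does the matching.

```r
# Stand-ins for getNames(dsC_Nsp) and getNames(dsC_Sty).
names_Nsp <- c("E1507T", "E1510T", "SHE1507")
names_Sty <- c("E1507T", "E1510T_STY", "SHE1507")

# Names present for one chip type but not the other.
onlyNsp <- setdiff(names_Nsp, names_Sty)
onlySty <- setdiff(names_Sty, names_Nsp)
print(onlyNsp)
print(onlySty)

# TRUE only if both chip types cover exactly the same set of names.
allPaired <- (length(onlyNsp) == 0L && length(onlySty) == 0L)
print(allPaired)
```

 Any name listed in onlyNsp or onlySty is a candidate for a fullname
 translator.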

  
  Would you mind telling me whether my analysis is correct?
  
  I also have SNP5.0 to merge, so should I merge all 3 arrays at one time,
  or merge 500K first and then SNP5.0?

 You can just include them as a third chip type above, e.g.

 res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
 "Mapping250K_Sty", "GenomeWideSNP_5"), verbose=verbose);

 Hope this helps/gets you started

 /Henrik 

  
  Thank you very much, 
  
  Wei 
  
  NCI/NIH 
  
  
  
  -- 
  When reporting problems on aroma.affymetrix, make sure 1) to run the
  latest version of the package, 2) to report the output of sessionInfo()
  and traceback(), and 3) to post a complete code example.
  
  
  You received this message because you are subscribed to the Google
  Groups "aroma.affymetrix" group with website
  http://www.aroma-project.org/.
  To post to this group, send email to aroma-af...@googlegroups.com
  To unsubscribe and other options, go to
  http://www.aroma-project.org/forum/
  
  ---
  You received this message because you are subscribed to the Google
  Groups "aroma.affymetrix" group.
  To unsubscribe from this group and stop receiving emails from it, send
  an email to aroma-affymetr...@googlegroups.com.
  For more options, visit https://groups.google.com/groups/opt_out.
  
  



Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-05 Thread Wei Tang
Thank you, please see the info below.

script 

dataSet_500K="EC500K"
dsC_EC500K_Sty=doCRMAv2(dataSet_500K,chipType="Mapping250K_Sty",verbose=verbose)
dsC_EC500K_Nsp=doCRMAv2(dataSet_500K,chipType="Mapping250K_Nsp",verbose=verbose)

dataSet="EC500K"
tags <- "ACC,-XY,BPN,-XY,RMA,A+B,FLN,-XY"
## OR ## tags <- "ACC,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
"Mapping250K_Sty"), verbose=-10)



> traceback()
43: file(pathname, open = "rb")
42: readRawFooter.AromaTabularBinaryFile(this)
41: readRawFooter(this)
40: readFooter.AromaTabularBinaryFile(this)
39: readFooter(this)
38: getChipType.AromaUnitSignalBinaryFile(getOneFile(this), ...)
37: getChipType(getOneFile(this), ...)
36: getChipType.AromaUnitSignalBinarySet(X[[1L]], ...)
35: FUN(X[[1L]], ...)
34: lapply(X = X, FUN = FUN, ...)
33: sapply(res, FUN = getChipType)
32: getSets.AromaMicroarrayDataSetTuple(this)
31: getSets(this)
30: getNames.GenericDataFileSetList(this, ...)
29: getNames(this, ...)
28: length.GenericDataFileSetList(refTuple)
27: length(refTuple)
26: isPaired.CopyNumberChromosomalModel(this)
25: isPaired(this)
24: getAsteriskTags.CopyNumberSegmentationModel(this)
23: getAsteriskTags(this)
22: paste(getAsteriskTags(this)[-1], collapse = ",")
21: getTags.CopyNumberSegmentationModel(this)
20: getTags(this)
19: paste(getTags(this), collapse = ",")
18: paste("Tags:", paste(getTags(this), collapse = ","))
17: as.character.CopyNumberChromosomalModel(x)
16: as.character(x)
15: print(as.character(x))
14: print.Object(...)
13: print(...)
12: eval(expr, envir, enclos)
11: eval(expr, pf)
10: withVisible(eval(expr, pf))
9: evalVis(expr)
8: capture.Verbose(this, print(...), level = level)
7: capture(this, print(...), level = level)
6: print.Verbose(verbose, cbs)
5: print(verbose, cbs)
4: doCBS.CopyNumberDataSetTuple(dsTuple, arrays = arrays, ..., verbose = verbose)
3: doCBS(dsTuple, arrays = arrays, ..., verbose = verbose)
2: doCBS.default(dataSet, tags = tags, chipTypes = c("Mapping250K_Nsp",
   "Mapping250K_Sty"), verbose = -10)
1: doCBS(dataSet, tags = tags, chipTypes = c("Mapping250K_Nsp",
   "Mapping250K_Sty"), verbose = -10)




> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
[1] C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
 [1] R.cache_0.6.5          aroma.cn_1.3.3         DNAcopy_1.34.0
 [4] aroma.affymetrix_2.9.4 affxparser_1.32.1      aroma.apd_0.2.3
 [7] R.huge_0.4.1           aroma.light_1.30.2     aroma.core_2.9.5
[10] matrixStats_0.8.1      R.rsp_0.9.6            R.devices_2.2.2
[13] R.filesets_2.0.1       R.utils_1.23.2         R.oo_1.13.6
[16] R.methodsS3_1.4.2

loaded via a namespace (and not attached):
[1] PSCBS_0.34.8 digest_0.6.3 tools_3.0.0



On Wednesday, June 5, 2013 3:50:41 PM UTC-4, Henrik Bengtsson wrote:

 Hi. 

 On Wed, Jun 5, 2013 at 11:31 AM, Wei Tang tangw...@gmail.com wrote:
  Hi Henrik,
  
  Thank you for your suggestion.
  
  But when I ran
  
  res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
  "Mapping250K_Sty"), verbose=verbose);
  
  it complained
   
  Error in file(pathname, open = "rb") : invalid 'description' argument
   
  
  Do you know how to fix it?

 1. What does traceback() output immediately after you get that error? 
 2. Can you show me your complete script? 
 3. What is your sessionInfo()? 

  
  My samples are all paired tumor-normal: 36 paired samples on SNP5 and
  an additional 20 paired samples on 500K.
  
  Should I use multi-source copy-number normalization?

 Possibly - depending on the amount of attenuation in the different
 chip type hybridizations (which depends on date, lab, etc.) you may see
 a small improvement in power to detect change points.  However, even
 without doing MSCN it is still always better to merge platforms (as
 doCBS() does) than to run only single chips, cf. Figure 6 in H.
 Bengtsson, A. Ray, P. Spellman & T.P. Speed, "A single-sample method
 for normalizing and combining full-resolution copy numbers from
 multiple platforms, labs and analysis methods", Bioinformatics 2009
 [http://aroma-project.org/publications].

  And what about doASCRMAv2()? Is its usage the same as that of doCRMAv2()?

 That's if you plan to infer parent-specific CNs.  If you don't know 
 yet, use doASCRMAv2().  Everything should work the same with doCBS(). 

 /Henrik 

  
  Many thanks, 
  
  Wei 
  
  

Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-05 Thread Henrik Bengtsson
On Wed, Jun 5, 2013 at 1:22 PM, Wei Tang tangwei1...@gmail.com wrote:
 Thank you, please see the info below.

 script

 dataSet_500K="EC500K"
 dsC_EC500K_Sty=doCRMAv2(dataSet_500K,chipType="Mapping250K_Sty",verbose=verbose)
 dsC_EC500K_Nsp=doCRMAv2(dataSet_500K,chipType="Mapping250K_Nsp",verbose=verbose)

 dataSet="EC500K"
 tags <- "ACC,-XY,BPN,-XY,RMA,A+B,FLN,-XY"
 ## OR ## tags <- "ACC,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
 res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
 "Mapping250K_Sty"), verbose=-10)

Would you mind sharing (all of) the verbose output from the doCBS()
call?  That would help with troubleshooting (I have a guess what's
going on).  It would also be useful to see the output of

print(getFullNames(dsC_EC500K_Sty))
print(getFullNames(dsC_EC500K_Nsp))

If you don't want to share this on the mailing list, you can send it
to me offline.

/Henrik





Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-05 Thread Wei Tang
Here you are:

> print(getFullNames(dsC_EC500K_Sty))
 [1] E1507T_STY,total  E1510T_STY,total  E1520T_STY,total
 [4] E1521T_STY,total  E1532T_STY,total  E1535T_STY,total
 [7] E1542T_STY,total  E1546T_STY,total  E1558T_STY,total
[10] E1566T_STY,total  E1572T_STY,total  E1573T_STY,total
[13] E1575T_STY,total  E1584T_STY,total  E1589T_STY,total
[16] E1610T_STY,total  E1635T_STY,total  E1756T_STY,total
[19] E1782T_STY,total  E1796T_STY,total  SHE1507_STY,total
[22] SHE1510_STY,total SHE1520_STY,total SHE1521_STY,total
[25] SHE1532_STY,total SHE1535_STY,total SHE1542_STY,total
[28] SHE1546_STY,total SHE1558_STY,total SHE1566_STY,total
[31] SHE1572_STY,total SHE1573_STY,total SHE1575_STY,total
[34] SHE1584_STY,total SHE1589_STY,total SHE1610_STY,total
[37] SHE1635_STY,total SHE1756_STY,total SHE1782_STY,total
[40] SHE1796_STY,total
> print(getFullNames(dsC_EC500K_Nsp))
 [1] E1507T_Nsp,total  E1510T_Nsp,total  E1520T_Nsp,total
 [4] E1521T_Nsp,total  E1532T_Nsp,total  E1535T_Nsp,total
 [7] E1542T_Nsp,total  E1546T_Nsp,total  E1558T_Nsp,total
[10] E1566T_Nsp,total  E1572T_Nsp,total  E1573T_Nsp,total
[13] E1575T_Nsp,total  E1584T_Nsp,total  E1589T_Nsp,total
[16] E1610T_Nsp,total  E1635T_Nsp,total  E1756T_Nsp,total
[19] E1782T_Nsp,total  E1796T_Nsp,total  SHE1507_NSP,total
[22] SHE1510_NSP,total SHE1520_NSP,total SHE1521_NSP,total
[25] SHE1532_NSP,total SHE1535_NSP,total SHE1542_NSP,total
[28] SHE1546_NSP,total SHE1558_NSP,total SHE1566_NSP,total
[31] SHE1572_NSP,total SHE1573_NSP,total SHE1575_NSP,total
[34] SHE1584_NSP,total SHE1589_NSP,total SHE1610_NSP,total
[37] SHE1635_NSP,total SHE1756_NSP,total SHE1782_NSP,total
[40] SHE1796_NSP,total



Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-05 Thread Wei Tang
Do they need to have the same names in 500K? What about SNP5? Those are
additional samples.

On Wednesday, June 5, 2013 5:12:56 PM UTC-4, Wei Tang wrote:

 here you are

  print(getFullNames(dsC_EC500K_Sty))
  [1] E1507T_STY,total  E1510T_STY,total  E1520T_STY,total
  [4] E1521T_STY,total  E1532T_STY,total  E1535T_STY,total
  [7] E1542T_STY,total  E1546T_STY,total  E1558T_STY,total
 [10] E1566T_STY,total  E1572T_STY,total  E1573T_STY,total
 [13] E1575T_STY,total  E1584T_STY,total  E1589T_STY,total
 [16] E1610T_STY,total  E1635T_STY,total  E1756T_STY,total
 [19] E1782T_STY,total  E1796T_STY,total  SHE1507_STY,total
 [22] SHE1510_STY,total SHE1520_STY,total SHE1521_STY,total
 [25] SHE1532_STY,total SHE1535_STY,total SHE1542_STY,total
 [28] SHE1546_STY,total SHE1558_STY,total SHE1566_STY,total
 [31] SHE1572_STY,total SHE1573_STY,total SHE1575_STY,total
 [34] SHE1584_STY,total SHE1589_STY,total SHE1610_STY,total
 [37] SHE1635_STY,total SHE1756_STY,total SHE1782_STY,total
 [40] SHE1796_STY,total
  print(getFullNames(dsC_EC500K_Nsp))
  [1] E1507T_Nsp,total  E1510T_Nsp,total  E1520T_Nsp,total
  [4] E1521T_Nsp,total  E1532T_Nsp,total  E1535T_Nsp,total
  [7] E1542T_Nsp,total  E1546T_Nsp,total  E1558T_Nsp,total
 [10] E1566T_Nsp,total  E1572T_Nsp,total  E1573T_Nsp,total
 [13] E1575T_Nsp,total  E1584T_Nsp,total  E1589T_Nsp,total
 [16] E1610T_Nsp,total  E1635T_Nsp,total  E1756T_Nsp,total
 [19] E1782T_Nsp,total  E1796T_Nsp,total  SHE1507_NSP,total
 [22] SHE1510_NSP,total SHE1520_NSP,total SHE1521_NSP,total
 [25] SHE1532_NSP,total SHE1535_NSP,total SHE1542_NSP,total
 [28] SHE1546_NSP,total SHE1558_NSP,total SHE1566_NSP,total
 [31] SHE1572_NSP,total SHE1573_NSP,total SHE1575_NSP,total
 [34] SHE1584_NSP,total SHE1589_NSP,total SHE1610_NSP,total
 [37] SHE1635_NSP,total SHE1756_NSP,total SHE1782_NSP,total
 [40] SHE1796_NSP,total


 On Wednesday, June 5, 2013 5:01:19 PM UTC-4, Henrik Bengtsson wrote:

 On Wed, Jun 5, 2013 at 1:22 PM, Wei Tang tangw...@gmail.com wrote: 
  Thank you, please see the info below. 
  
  script 
  
  dataSet_500K="EC500K"

  dsC_EC500K_Sty=doCRMAv2(dataSet_500K,chipType="Mapping250K_Sty",verbose=verbose)

  dsC_EC500K_Nsp=doCRMAv2(dataSet_500K,chipType="Mapping250K_Nsp",verbose=verbose)

  dataSet="EC500K"
  tags <- "ACC,-XY,BPN,-XY,RMA,A+B,FLN,-XY" ## OR ## tags <- "ACC,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
  res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
  "Mapping250K_Sty"), verbose=-10)

 Would you mind sharing the output of (all) the verbose output from the 
 doCBS() call?  That would help troubleshooting (I have a guess what's 
 going on).  It would also be useful to see the output of 

 print(getFullNames(dsC_EC500K_Sty)) 
 print(getFullNames(dsC_EC500K_Nsp)) 

 If you don't want to share this on the mailing list, you can send it 
 to me offline. 

 /Henrik 

  
  
  
  traceback() 
  43: file(pathname, open = "rb")
  42: readRawFooter.AromaTabularBinaryFile(this) 
  41: readRawFooter(this) 
  40: readFooter.AromaTabularBinaryFile(this) 
  39: readFooter(this) 
  38: getChipType.AromaUnitSignalBinaryFile(getOneFile(this), ...) 
  37: getChipType(getOneFile(this), ...) 
  36: getChipType.AromaUnitSignalBinarySet(X[[1L]], ...) 
  35: FUN(X[[1L]], ...) 
  34: lapply(X = X, FUN = FUN, ...) 
  33: sapply(res, FUN = getChipType) 
  32: getSets.AromaMicroarrayDataSetTuple(this) 
  31: getSets(this) 
  30: getNames.GenericDataFileSetList(this, ...) 
  29: getNames(this, ...) 
  28: length.GenericDataFileSetList(refTuple) 
  27: length(refTuple) 
  26: isPaired.CopyNumberChromosomalModel(this) 
  25: isPaired(this) 
  24: getAsteriskTags.CopyNumberSegmentationModel(this) 
  23: getAsteriskTags(this) 
  22: paste(getAsteriskTags(this)[-1], collapse = ",")
  21: getTags.CopyNumberSegmentationModel(this)
  20: getTags(this)
  19: paste(getTags(this), collapse = ",")
  18: paste("Tags:", paste(getTags(this), collapse = ","))
  17: as.character.CopyNumberChromosomalModel(x) 
  16: as.character(x) 
  15: print(as.character(x)) 
  14: print.Object(...) 
  13: print(...) 
  12: eval(expr, envir, enclos) 
  11: eval(expr, pf) 
  10: withVisible(eval(expr, pf)) 
  9: evalVis(expr) 
  8: capture.Verbose(this, print(...), level = level) 
  7: capture(this, print(...), level = level) 
  6: print.Verbose(verbose, cbs) 
  5: print(verbose, cbs) 
  4: doCBS.CopyNumberDataSetTuple(dsTuple, arrays = arrays, ..., verbose =
  verbose)
  3: doCBS(dsTuple, arrays = arrays, ..., verbose = verbose)
  2: doCBS.default(dataSet, tags = tags, chipTypes = c("Mapping250K_Nsp",
  "Mapping250K_Sty"), verbose = -10)
  1: doCBS(dataSet, tags = tags, chipTypes = c("Mapping250K_Nsp",
  "Mapping250K_Sty"), verbose = -10)
  
  
  
  
  sessionInfo() 
  R version 3.0.0 (2013-04-03) 
  Platform: x86_64-unknown-linux-gnu (64-bit) 
  
  locale: 
  [1] C 
  
  attached base packages: 
  [1] stats graphics  grDevices utils datasets  methods   base 
  
  other attached packages: 
   [1] R.cache_0.6.5  

Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-05 Thread Henrik Bengtsson
On Wed, Jun 5, 2013 at 2:15 PM, Wei Tang tangwei1...@gmail.com wrote:
 Do they need to have the same names in 500K? How about SNP5? They are
 additional samples.

Yes (I did write this in my long initial message).  doCBS() identifies
which tuples of arrays belong to which samples by matching up their
*names*, that is, by looking at:

getNames(dsC_EC500K_Nsp)
getNames(dsC_EC500K_Sty)

Note the difference between *names* and *full names* (cf.
http://aroma-project.org/definitions/namesAndTags).  In your case the
*names* are:

 getNames(dsC_EC500K_Sty)
[1] E1507T_STY  E1510T_STY  E1520T_STY ... SHE1796_STY
 getNames(dsC_EC500K_Nsp)
[1] E1507T_Nsp  E1510T_Nsp  E1520T_Nsp ... SHE1796_NSP

Because of this, doCBS() fails to pair them up.  So, yes, if they were:

 getNames(dsC_EC500K_Sty)
[1] E1507T  E1510T  E1520T ... SHE1796
 getNames(dsC_EC500K_Nsp)
[1] E1507T  E1510T  E1520T ... SHE1796

it would work as you expect.   The *names* come directly from the
*full names*, which by default come from the *file names* (see the link
above).  Now, you don't have to rename the files to change the full
names.  Instead you can use a so-called full names translator function,
which allows you to rename *full names* on the fly, cf.
http://aroma-project.org/howtos/setFullNamesTranslator.   In your case
you only have to replace the underscores (_) with commas (,) and
everything will work.  So, do:

fnt <- function(names, ...) gsub("_", ",", names, fixed=TRUE);
setFullNamesTranslator(dsC_EC500K_Sty, fnt);
setFullNamesTranslator(dsC_EC500K_Nsp, fnt);

and you should get:

 getFullNames(dsC_EC500K_Sty)
[1] E1507T,STY,total E1510T,STY,total ... SHE1796,STY,total
 getFullNames(dsC_EC500K_Nsp)
[1] E1507T,Nsp,total E1510T,Nsp,total ... SHE1796,NSP,total

and therefore:

 getNames(dsC_EC500K_Sty)
[1] E1507T E1510T ... SHE1796
 getNames(dsC_EC500K_Nsp)
[1] E1507T E1510T ... SHE1796

Then retry with doCBS().
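
As a sanity check before rerunning, the effect of the translator can be reproduced with base R alone.  This is only a sketch under stated assumptions: nameOf() is a hypothetical helper (not an aroma function) that assumes the *name* is whatever precedes the first comma of the *full name*, as described on the names-and-tags page linked above, and the vectors hold just a few of the full names reported in this thread.

```r
# A few of the full names reported in the thread (quotes restored)
fullNamesSty <- c("E1507T_STY,total", "SHE1796_STY,total")
fullNamesNsp <- c("E1507T_Nsp,total", "SHE1796_NSP,total")

# The translator: replace every underscore with a comma
fnt <- function(names, ...) gsub("_", ",", names, fixed=TRUE)

# Hypothetical helper (assumption): name = part before the first comma
nameOf <- function(fullnames) sub(",.*$", "", fullnames)

# Before translation the names differ between the two chip types,
# so there is nothing to pair up
before <- intersect(nameOf(fullNamesSty), nameOf(fullNamesNsp))
print(before)

# After translation the chip-type suffix has become a tag,
# and the names agree
after <- intersect(nameOf(fnt(fullNamesSty)), nameOf(fnt(fullNamesNsp)))
print(after)
```

If the second intersect() returns every sample you expect, the names should line up across the two chip types.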

If your GenomeWideSNP_5 arrays have completely different name formats,
you have to create a fancier full names translator function that
takes the input names and translates them to match the above.
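
To illustrate what such a translator could look like, here is a sketch using a regular expression.  The input format is entirely made up, since the actual GenomeWideSNP_5 file names are not shown in this thread: the "GW5-<sample>-<run>" pattern and the fntGws5() name are assumptions for illustration only and must be adapted to the real names.

```r
# HYPOTHETICAL full names; replace with the output of getFullNames()
# on your own GenomeWideSNP_5 data set
gws5 <- c("GW5-E1507T-run2", "GW5-SHE1507-run1")

# Translator sketch: move the sample ID to the front so that it becomes
# the name, and turn the remaining pieces into comma-separated tags
fntGws5 <- function(names, ...) {
  sub("^GW5-((SH)?E[0-9]+T?)-(.*)$", "\\1,GW5,\\3", names)
}

translated <- fntGws5(gws5)
print(translated)
```

Names that do not match the pattern are returned unchanged by sub(), so it is worth inspecting the translated full names before passing the function to setFullNamesTranslator().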

Hope this helps

Henrik



 On Wednesday, June 5, 2013 5:12:56 PM UTC-4, Wei Tang wrote:

 here you are

  print(getFullNames(dsC_EC500K_Sty))
  [1] E1507T_STY,total  E1510T_STY,total  E1520T_STY,total
  [4] E1521T_STY,total  E1532T_STY,total  E1535T_STY,total
  [7] E1542T_STY,total  E1546T_STY,total  E1558T_STY,total
 [10] E1566T_STY,total  E1572T_STY,total  E1573T_STY,total
 [13] E1575T_STY,total  E1584T_STY,total  E1589T_STY,total
 [16] E1610T_STY,total  E1635T_STY,total  E1756T_STY,total
 [19] E1782T_STY,total  E1796T_STY,total  SHE1507_STY,total
 [22] SHE1510_STY,total SHE1520_STY,total SHE1521_STY,total
 [25] SHE1532_STY,total SHE1535_STY,total SHE1542_STY,total
 [28] SHE1546_STY,total SHE1558_STY,total SHE1566_STY,total
 [31] SHE1572_STY,total SHE1573_STY,total SHE1575_STY,total
 [34] SHE1584_STY,total SHE1589_STY,total SHE1610_STY,total
 [37] SHE1635_STY,total SHE1756_STY,total SHE1782_STY,total
 [40] SHE1796_STY,total
  print(getFullNames(dsC_EC500K_Nsp))
  [1] E1507T_Nsp,total  E1510T_Nsp,total  E1520T_Nsp,total
  [4] E1521T_Nsp,total  E1532T_Nsp,total  E1535T_Nsp,total
  [7] E1542T_Nsp,total  E1546T_Nsp,total  E1558T_Nsp,total
 [10] E1566T_Nsp,total  E1572T_Nsp,total  E1573T_Nsp,total
 [13] E1575T_Nsp,total  E1584T_Nsp,total  E1589T_Nsp,total
 [16] E1610T_Nsp,total  E1635T_Nsp,total  E1756T_Nsp,total
 [19] E1782T_Nsp,total  E1796T_Nsp,total  SHE1507_NSP,total
 [22] SHE1510_NSP,total SHE1520_NSP,total SHE1521_NSP,total
 [25] SHE1532_NSP,total SHE1535_NSP,total SHE1542_NSP,total
 [28] SHE1546_NSP,total SHE1558_NSP,total SHE1566_NSP,total
 [31] SHE1572_NSP,total SHE1573_NSP,total SHE1575_NSP,total
 [34] SHE1584_NSP,total SHE1589_NSP,total SHE1610_NSP,total
 [37] SHE1635_NSP,total SHE1756_NSP,total SHE1782_NSP,total
 [40] SHE1796_NSP,total


 On Wednesday, June 5, 2013 5:01:19 PM UTC-4, Henrik Bengtsson wrote:

 On Wed, Jun 5, 2013 at 1:22 PM, Wei Tang tangw...@gmail.com wrote:
  Thank you, please see the info below.
 
  script
 
  dataSet_500K="EC500K"
 
  dsC_EC500K_Sty=doCRMAv2(dataSet_500K,chipType="Mapping250K_Sty",verbose=verbose)
 
  dsC_EC500K_Nsp=doCRMAv2(dataSet_500K,chipType="Mapping250K_Nsp",verbose=verbose)
 
  dataSet="EC500K"
  tags <- "ACC,-XY,BPN,-XY,RMA,A+B,FLN,-XY" ## OR ## tags <- "ACC,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
  res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
  "Mapping250K_Sty"), verbose=-10)

 Would you mind sharing the output of (all) the verbose output from the
 doCBS() call?  That would help troubleshooting (I have a guess what's
 going on).  It would also be useful to see the output of

 print(getFullNames(dsC_EC500K_Sty))
 print(getFullNames(dsC_EC500K_Nsp))

 If you don't want to share this on the mailing list, you can send it
 to me offline.

 /Henrik

 
 
 
  traceback()
  43: file(pathname, open = "rb")
  42: readRawFooter.AromaTabularBinaryFile(this)
  41: readRawFooter(this)
  40: 

Re: [aroma.affymetrix] 500K by doCRMAv2

2013-06-05 Thread Wei Tang
Thank you so much for your guidance; the name translator is a really
smart move.

I think I should be able to make it this time.



On Wednesday, June 5, 2013 5:50:52 PM UTC-4, Henrik Bengtsson wrote:

 On Wed, Jun 5, 2013 at 2:15 PM, Wei Tang tangw...@gmail.com wrote: 
  Do they need to have the same names in 500K? How about SNP5? They are 
  additional samples. 

 Yes (I did write this in my long initial message).  doCBS() identifies 
 which tuples of arrays belong to which samples by matching up their 
 *names*, that is, by looking at: 

 getNames(dsC_EC500K_Nsp) 
 getNames(dsC_EC500K_Sty) 

 Note the difference between *names* and *full names* (cf. 
 http://aroma-project.org/definitions/namesAndTags).  In your case the 
 *names* are: 

  getNames(dsC_EC500K_Sty) 
 [1] E1507T_STY  E1510T_STY  E1520T_STY ... SHE1796_STY 
  getNames(dsC_EC500K_Nsp) 
 [1] E1507T_Nsp  E1510T_Nsp  E1520T_Nsp ... SHE1796_NSP 

 Because of this, doCBS() fails to pair them up.  So, yes, if they were: 

  getNames(dsC_EC500K_Sty) 
 [1] E1507T  E1510T  E1520T ... SHE1796 
  getNames(dsC_EC500K_Nsp) 
 [1] E1507T  E1510T  E1520T ... SHE1796 

 it would work as you expect.   The *names* come directly from the 
 *full names*, which by default come from the *file names* (see the link 
 above).  Now, you don't have to rename the files to change the full 
 names.  Instead you can use a so-called full names translator function, 
 which allows you to rename *full names* on the fly, cf. 
 http://aroma-project.org/howtos/setFullNamesTranslator.   In your case 
 you only have to replace the underscores (_) with commas (,) and 
 everything will work.  So, do: 

 fnt <- function(names, ...) gsub("_", ",", names, fixed=TRUE); 
 setFullNamesTranslator(dsC_EC500K_Sty, fnt); 
 setFullNamesTranslator(dsC_EC500K_Nsp, fnt); 

 and you should get: 

  getFullNames(dsC_EC500K_Sty) 
 [1] E1507T,STY,total E1510T,STY,total ... SHE1796,STY,total 
  getFullNames(dsC_EC500K_Nsp) 
 [1] E1507T,Nsp,total E1510T,Nsp,total ... SHE1796,NSP,total 

 and therefore: 

  getNames(dsC_EC500K_Sty) 
 [1] E1507T E1510T ... SHE1796 
  getNames(dsC_EC500K_Nsp) 
 [1] E1507T E1510T ... SHE1796 

 Then retry with doCBS(). 

 If your GenomeWideSNP_5 arrays have completely different name formats, 
 you have to create a fancier full names translator function that 
 takes the input names and translates them to match the above. 

 Hope this helps 

 Henrik 

  
  
  On Wednesday, June 5, 2013 5:12:56 PM UTC-4, Wei Tang wrote: 
  
  here you are 
  
   print(getFullNames(dsC_EC500K_Sty)) 
   [1] E1507T_STY,total  E1510T_STY,total  E1520T_STY,total 
   [4] E1521T_STY,total  E1532T_STY,total  E1535T_STY,total 
   [7] E1542T_STY,total  E1546T_STY,total  E1558T_STY,total 
  [10] E1566T_STY,total  E1572T_STY,total  E1573T_STY,total 
  [13] E1575T_STY,total  E1584T_STY,total  E1589T_STY,total 
  [16] E1610T_STY,total  E1635T_STY,total  E1756T_STY,total 
  [19] E1782T_STY,total  E1796T_STY,total  SHE1507_STY,total 
  [22] SHE1510_STY,total SHE1520_STY,total SHE1521_STY,total 
  [25] SHE1532_STY,total SHE1535_STY,total SHE1542_STY,total 
  [28] SHE1546_STY,total SHE1558_STY,total SHE1566_STY,total 
  [31] SHE1572_STY,total SHE1573_STY,total SHE1575_STY,total 
  [34] SHE1584_STY,total SHE1589_STY,total SHE1610_STY,total 
  [37] SHE1635_STY,total SHE1756_STY,total SHE1782_STY,total 
  [40] SHE1796_STY,total 
   print(getFullNames(dsC_EC500K_Nsp)) 
   [1] E1507T_Nsp,total  E1510T_Nsp,total  E1520T_Nsp,total 
   [4] E1521T_Nsp,total  E1532T_Nsp,total  E1535T_Nsp,total 
   [7] E1542T_Nsp,total  E1546T_Nsp,total  E1558T_Nsp,total 
  [10] E1566T_Nsp,total  E1572T_Nsp,total  E1573T_Nsp,total 
  [13] E1575T_Nsp,total  E1584T_Nsp,total  E1589T_Nsp,total 
  [16] E1610T_Nsp,total  E1635T_Nsp,total  E1756T_Nsp,total 
  [19] E1782T_Nsp,total  E1796T_Nsp,total  SHE1507_NSP,total 
  [22] SHE1510_NSP,total SHE1520_NSP,total SHE1521_NSP,total 
  [25] SHE1532_NSP,total SHE1535_NSP,total SHE1542_NSP,total 
  [28] SHE1546_NSP,total SHE1558_NSP,total SHE1566_NSP,total 
  [31] SHE1572_NSP,total SHE1573_NSP,total SHE1575_NSP,total 
  [34] SHE1584_NSP,total SHE1589_NSP,total SHE1610_NSP,total 
  [37] SHE1635_NSP,total SHE1756_NSP,total SHE1782_NSP,total 
  [40] SHE1796_NSP,total 
  
  
  On Wednesday, June 5, 2013 5:01:19 PM UTC-4, Henrik Bengtsson wrote: 
  
  On Wed, Jun 5, 2013 at 1:22 PM, Wei Tang tangw...@gmail.com wrote: 
   Thank you, please see the info below. 
   
   script 
   
   dataSet_500K="EC500K" 
   
   dsC_EC500K_Sty=doCRMAv2(dataSet_500K,chipType="Mapping250K_Sty",verbose=verbose)
   
   dsC_EC500K_Nsp=doCRMAv2(dataSet_500K,chipType="Mapping250K_Nsp",verbose=verbose)
   
   dataSet="EC500K" 
   tags <- "ACC,-XY,BPN,-XY,RMA,A+B,FLN,-XY" ## OR ## tags <- "ACC,-XY,BPN,-XY,AVG,A+B,FLN,-XY" 
   res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp", 
   "Mapping250K_Sty"), verbose=-10) 
  
  Would you mind sharing the output of (all) the verbose output from the 
  doCBS() call?