Re: [aroma.affymetrix] 500K by doCRMAv2

Henrik Bengtsson Wed, 05 Jun 2013 14:52:30 -0700

On Wed, Jun 5, 2013 at 2:15 PM, Wei Tang <tangwei1...@gmail.com> wrote:
> do they need to be the same names in 500K? How about SNP5, they are
> additional samples.


Yes (I did write this in my long initial message).  doCBS() identifies
which tuples of arrays belong to which samples by matching up their
*names*, that is, by looking at:

getNames(dsC_EC500K_Nsp)
getNames(dsC_EC500K_Sty)

Note the difference between *names* and *full names* (cf.
http://aroma-project.org/definitions/namesAndTags).  In your case the
*names* are:

> getNames(dsC_EC500K_Sty)
[1] "E1507T_STY"  "E1510T_STY"  "E1520T_STY" ... "SHE1796_STY"
> getNames(dsC_EC500K_Nsp)
[1] "E1507T_Nsp"  "E1510T_Nsp"  "E1520T_Nsp" ... "SHE1796_NSP"
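
One way to see the mismatch is to compare the two name vectors directly. The following is only a minimal base-R sketch, not part of aroma.affymetrix: the vectors are typed in by hand from the output above (in a real session they would come from getNames()), and the variable names are made up for illustration.

```r
# Names copied by hand from this thread; in a real session these
# would be getNames(dsC_EC500K_Sty) and getNames(dsC_EC500K_Nsp).
namesSty <- c("E1507T_STY", "E1510T_STY", "SHE1796_STY")
namesNsp <- c("E1507T_Nsp", "E1510T_Nsp", "SHE1796_NSP")

# Names present in both sets can be paired up; the rest cannot.
paired      <- intersect(namesSty, namesNsp)
unpairedSty <- setdiff(namesSty, namesNsp)
unpairedNsp <- setdiff(namesNsp, namesSty)

print(paired)       # character(0): no name occurs in both sets
print(unpairedSty)  # all three Sty names are left without a partner
```

An empty intersection like this means there is no array in one set whose name matches an array in the other.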

Because of this, doCBS() fails to pair them up.  So, yes, if they were:

> getNames(dsC_EC500K_Sty)
[1] "E1507T"  "E1510T"  "E1520T" ... "SHE1796"
> getNames(dsC_EC500K_Nsp)
[1] "E1507T"  "E1510T"  "E1520T" ... "SHE1796"

it would work as you expect.  The *names* come directly from the
*full names* (the name is the part of the full name before the first
comma; the rest are tags), and the full names by default come from the
*file names* (see the link above).  Now, you don't have to rename the
files to change the full names.  Instead you can use a so-called
fullnames translator function, which allows you to rename *full names*
"on the fly", cf. http://aroma-project.org/howtos/setFullNamesTranslator.
In your case you only have to replace the underscores (_) with commas (,)
and everything will work.  So, do:

fnt <- function(names, ...) gsub("_", ",", names, fixed=TRUE);
setFullNamesTranslator(dsC_EC500K_Sty, fnt);
setFullNamesTranslator(dsC_EC500K_Nsp, fnt);

and you should get:

> getFullNames(dsC_EC500K_Sty)
[1] "E1507T,STY,total" "E1510T,STY,total" ... "SHE1796,STY,total"
> getFullNames(dsC_EC500K_Nsp)
[1] "E1507T,Nsp,total" "E1510T,Nsp,total" ... "SHE1796,NSP,total"

and therefore:

> getNames(dsC_EC500K_Sty)
[1] "E1507T" "E1510T" ... "SHE1796"
> getNames(dsC_EC500K_Nsp)
[1] "E1507T" "E1510T" ... "SHE1796"

Then retry with doCBS().
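
To see why this works without touching any data, here is a small base-R sketch that applies the same translator to a few full names typed in by hand from this thread. Note that nameOf() is a hypothetical helper written for this illustration, not an aroma function; it only mimics the rule that the name is the part of the full name before the first comma.

```r
# Full names copied by hand from this thread; in a real session they
# would come from getFullNames() on each data set.
fullNamesSty <- c("E1507T_STY,total", "SHE1796_STY,total")
fullNamesNsp <- c("E1507T_Nsp,total", "SHE1796_NSP,total")

# The translator from above: turn every underscore into a comma.
fnt <- function(names, ...) gsub("_", ",", names, fixed=TRUE)

# Hypothetical helper (not an aroma function): the name is the part of
# the full name before the first comma; everything after it is tags.
nameOf <- function(fullnames) sub(",.*$", "", fullnames)

print(nameOf(fullNamesSty))       # "E1507T_STY" "SHE1796_STY"
print(fnt(fullNamesSty))          # "E1507T,STY,total" "SHE1796,STY,total"
print(nameOf(fnt(fullNamesSty)))  # "E1507T" "SHE1796"

# After translation the two sets yield identical names.
print(identical(nameOf(fnt(fullNamesSty)), nameOf(fnt(fullNamesNsp))))  # TRUE
```

The enzyme suffix (STY, Nsp, NSP) ends up as a tag, so its inconsistent capitalization no longer affects the names.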

If your GenomeWideSNP_5 arrays have completely different name formats,
you will have to write a more elaborate fullnames translator function
that takes the input names and translates them to match the above.
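
As a rough illustration only: I don't know what your GenomeWideSNP_5 file names look like, so the "EC_" prefix and "_SNP5" suffix below are pure assumptions, as is the function name fntSnp5. The sketch just shows the general shape such a translator could take, tested here on a plain character vector in base R.

```r
# ASSUMPTION: the SNP5 full names look like "EC_E1507T_SNP5,total".
# This format is invented for the example; adapt the patterns to
# whatever getFullNames() actually reports for your SNP5 data set.
fntSnp5 <- function(names, ...) {
  # Drop the (hypothetical) leading study prefix.
  names <- sub("^EC_", "", names)
  # Turn the remaining underscores into commas, so that everything
  # after the sample name becomes tags.
  names <- gsub("_", ",", names, fixed=TRUE)
  names
}

print(fntSnp5(c("EC_E1507T_SNP5,total", "EC_SHE1796_SNP5,total")))
# "E1507T,SNP5,total" "SHE1796,SNP5,total"
```

Such a function would then be passed to setFullNamesTranslator() for the GenomeWideSNP_5 data set, in the same way as fnt above.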

Hope this helps

Henrik

>
>
> On Wednesday, June 5, 2013 5:12:56 PM UTC-4, Wei Tang wrote:
>>
>> here you are
>>
>> > print(getFullNames(dsC_EC500K_Sty))
>>  [1] "E1507T_STY,total"  "E1510T_STY,total"  "E1520T_STY,total"
>>  [4] "E1521T_STY,total"  "E1532T_STY,total"  "E1535T_STY,total"
>>  [7] "E1542T_STY,total"  "E1546T_STY,total"  "E1558T_STY,total"
>> [10] "E1566T_STY,total"  "E1572T_STY,total"  "E1573T_STY,total"
>> [13] "E1575T_STY,total"  "E1584T_STY,total"  "E1589T_STY,total"
>> [16] "E1610T_STY,total"  "E1635T_STY,total"  "E1756T_STY,total"
>> [19] "E1782T_STY,total"  "E1796T_STY,total"  "SHE1507_STY,total"
>> [22] "SHE1510_STY,total" "SHE1520_STY,total" "SHE1521_STY,total"
>> [25] "SHE1532_STY,total" "SHE1535_STY,total" "SHE1542_STY,total"
>> [28] "SHE1546_STY,total" "SHE1558_STY,total" "SHE1566_STY,total"
>> [31] "SHE1572_STY,total" "SHE1573_STY,total" "SHE1575_STY,total"
>> [34] "SHE1584_STY,total" "SHE1589_STY,total" "SHE1610_STY,total"
>> [37] "SHE1635_STY,total" "SHE1756_STY,total" "SHE1782_STY,total"
>> [40] "SHE1796_STY,total"
>> > print(getFullNames(dsC_EC500K_Nsp))
>>  [1] "E1507T_Nsp,total"  "E1510T_Nsp,total"  "E1520T_Nsp,total"
>>  [4] "E1521T_Nsp,total"  "E1532T_Nsp,total"  "E1535T_Nsp,total"
>>  [7] "E1542T_Nsp,total"  "E1546T_Nsp,total"  "E1558T_Nsp,total"
>> [10] "E1566T_Nsp,total"  "E1572T_Nsp,total"  "E1573T_Nsp,total"
>> [13] "E1575T_Nsp,total"  "E1584T_Nsp,total"  "E1589T_Nsp,total"
>> [16] "E1610T_Nsp,total"  "E1635T_Nsp,total"  "E1756T_Nsp,total"
>> [19] "E1782T_Nsp,total"  "E1796T_Nsp,total"  "SHE1507_NSP,total"
>> [22] "SHE1510_NSP,total" "SHE1520_NSP,total" "SHE1521_NSP,total"
>> [25] "SHE1532_NSP,total" "SHE1535_NSP,total" "SHE1542_NSP,total"
>> [28] "SHE1546_NSP,total" "SHE1558_NSP,total" "SHE1566_NSP,total"
>> [31] "SHE1572_NSP,total" "SHE1573_NSP,total" "SHE1575_NSP,total"
>> [34] "SHE1584_NSP,total" "SHE1589_NSP,total" "SHE1610_NSP,total"
>> [37] "SHE1635_NSP,total" "SHE1756_NSP,total" "SHE1782_NSP,total"
>> [40] "SHE1796_NSP,total"
>>
>>
>> On Wednesday, June 5, 2013 5:01:19 PM UTC-4, Henrik Bengtsson wrote:
>>>
>>> On Wed, Jun 5, 2013 at 1:22 PM, Wei Tang <tangw...@gmail.com> wrote:
>>> > Thank you, please see the info below.
>>> >
>>> > script
>>> >
>>> > dataSet_500K="EC500K"
>>> >
>>> > dsC_EC500K_Sty=doCRMAv2(dataSet_500K,chipType="Mapping250K_Sty",verbose=verbose)
>>> >
>>> > dsC_EC500K_Nsp=doCRMAv2(dataSet_500K,chipType="Mapping250K_Nsp",verbose=verbose)
>>> >
>>> > dataSet="EC500K"
>>> > tags <- "ACC,-XY,BPN,-XY,RMA,A+B,FLN,-XY"
>>> > ## OR ## tags <- "ACC,-XY,BPN,-XY,AVG,A+B,FLN,-XY"
>>> > res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
>>> > "Mapping250K_Sty"), verbose=-10)
>>>
>>> Would you mind sharing (all of) the verbose output from the doCBS()
>>> call?  That would help troubleshooting (I have a guess what's going
>>> on).  It would also be useful to see the output of
>>>
>>> print(getFullNames(dsC_EC500K_Sty))
>>> print(getFullNames(dsC_EC500K_Nsp))
>>>
>>> If you don't want to share this on the mailing list, you can send it
>>> to me offline.
>>>
>>> /Henrik
>>>
>>> >
>>> >
>>> >
>>> >> traceback()
>>> > 43: file(pathname, open = "rb")
>>> > 42: readRawFooter.AromaTabularBinaryFile(this)
>>> > 41: readRawFooter(this)
>>> > 40: readFooter.AromaTabularBinaryFile(this)
>>> > 39: readFooter(this)
>>> > 38: getChipType.AromaUnitSignalBinaryFile(getOneFile(this), ...)
>>> > 37: getChipType(getOneFile(this), ...)
>>> > 36: getChipType.AromaUnitSignalBinarySet(X[[1L]], ...)
>>> > 35: FUN(X[[1L]], ...)
>>> > 34: lapply(X = X, FUN = FUN, ...)
>>> > 33: sapply(res, FUN = getChipType)
>>> > 32: getSets.AromaMicroarrayDataSetTuple(this)
>>> > 31: getSets(this)
>>> > 30: getNames.GenericDataFileSetList(this, ...)
>>> > 29: getNames(this, ...)
>>> > 28: length.GenericDataFileSetList(refTuple)
>>> > 27: length(refTuple)
>>> > 26: isPaired.CopyNumberChromosomalModel(this)
>>> > 25: isPaired(this)
>>> > 24: getAsteriskTags.CopyNumberSegmentationModel(this)
>>> > 23: getAsteriskTags(this)
>>> > 22: paste(getAsteriskTags(this)[-1], collapse = ",")
>>> > 21: getTags.CopyNumberSegmentationModel(this)
>>> > 20: getTags(this)
>>> > 19: paste(getTags(this), collapse = ",")
>>> > 18: paste("Tags:", paste(getTags(this), collapse = ","))
>>> > 17: as.character.CopyNumberChromosomalModel(x)
>>> > 16: as.character(x)
>>> > 15: print(as.character(x))
>>> > 14: print.Object(...)
>>> > 13: print(...)
>>> > 12: eval(expr, envir, enclos)
>>> > 11: eval(expr, pf)
>>> > 10: withVisible(eval(expr, pf))
>>> > 9: evalVis(expr)
>>> > 8: capture.Verbose(this, print(...), level = level)
>>> > 7: capture(this, print(...), level = level)
>>> > 6: print.Verbose(verbose, cbs)
>>> > 5: print(verbose, cbs)
>>> > 4: doCBS.CopyNumberDataSetTuple(dsTuple, arrays = arrays, ...,
>>> >        verbose = verbose)
>>> > 3: doCBS(dsTuple, arrays = arrays, ..., verbose = verbose)
>>> > 2: doCBS.default(dataSet, tags = tags, chipTypes = c("Mapping250K_Nsp",
>>> >        "Mapping250K_Sty"), verbose = -10)
>>> > 1: doCBS(dataSet, tags = tags, chipTypes = c("Mapping250K_Nsp",
>>> >        "Mapping250K_Sty"), verbose = -10)
>>> >
>>> >
>>> >
>>> >
>>> >> sessionInfo()
>>> > R version 3.0.0 (2013-04-03)
>>> > Platform: x86_64-unknown-linux-gnu (64-bit)
>>> >
>>> > locale:
>>> > [1] C
>>> >
>>> > attached base packages:
>>> > [1] stats     graphics  grDevices utils     datasets  methods   base
>>> >
>>> > other attached packages:
>>> >  [1] R.cache_0.6.5          aroma.cn_1.3.3         DNAcopy_1.34.0
>>> >  [4] aroma.affymetrix_2.9.4 affxparser_1.32.1      aroma.apd_0.2.3
>>> >  [7] R.huge_0.4.1           aroma.light_1.30.2     aroma.core_2.9.5
>>> > [10] matrixStats_0.8.1      R.rsp_0.9.6            R.devices_2.2.2
>>> > [13] R.filesets_2.0.1       R.utils_1.23.2         R.oo_1.13.6
>>> > [16] R.methodsS3_1.4.2
>>> >
>>> > loaded via a namespace (and not attached):
>>> > [1] PSCBS_0.34.8 digest_0.6.3 tools_3.0.0
>>> >
>>> >
>>> >
>>> > On Wednesday, June 5, 2013 3:50:41 PM UTC-4, Henrik Bengtsson wrote:
>>> >>
>>> >> Hi.
>>> >>
>>> >> On Wed, Jun 5, 2013 at 11:31 AM, Wei Tang <tangw...@gmail.com> wrote:
>>> >> > Hi Henrik ,
>>> >> >
>>> >> > Thank you for your suggestion.
>>> >> >
>>> >> > but when I ran
>>> >> >
>>> >> > res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
>>> >> > "Mapping250K_Sty"), verbose=verbose);
>>> >> >
>>> >> > it complained
>>> >> > "
>>> >> > Error in file(pathname, open = "rb") : invalid 'description' argument
>>> >> > "
>>> >> >
>>> >> > do you know how to fix it?
>>> >>
>>> >> 1. What does traceback() output immediately after you get that error?
>>> >> 2. Can you show me your complete script?
>>> >> 3. What is your sessionInfo()?
>>> >>
>>> >> >
>>> >> > my situation is all paired tumor-normal: 36 paired samples on SNP5
>>> >> > and an additional 20 paired samples on 500K.
>>> >> >
>>> >> > Should I use "Multi-source copy-number normalization"?
>>> >>
>>> >> Possibly - depending on the amount of attenuation in the different
>>> >> chip type hybridizations (depends on date, lab etc) you may see a
>>> >> small improvement in power to detect change points.  However, even
>>> >> without doing MSCN it is still always better to merge platforms (as
>>> >> doCBS() does) than running only single chips, cf. Figure 6 in H.
>>> >> Bengtsson, A. Ray, P. Spellman & T.P. Speed, A single-sample method
>>> >> for normalizing and combining full-resolution copy numbers from
>>> >> multiple platforms, labs and analysis methods, Bioinformatics 2009
>>> >> [http://aroma-project.org/publications].
>>> >>
>>> >> > And how about "doASCRMAv2": is its usage the same as that of
>>> >> > "doCRMAv2"?
>>> >>
>>> >> That's what to use if you plan to infer parent-specific CNs.  If you
>>> >> don't know yet, use doASCRMAv2().  Everything should work the same
>>> >> with doCBS().
>>> >>
>>> >> /Henrik
>>> >>
>>> >> >
>>> >> > Many thanks,
>>> >> >
>>> >> > Wei
>>> >> >
>>> >> >
>>> >> > On Thursday, May 30, 2013 6:05:55 PM UTC-4, Henrik Bengtsson wrote:
>>> >> >>
>>> >> >> Hi,
>>> >> >>
>>> >> >> I've done some updates to the help pages (e.g. ?doCBS), so before
>>> >> >> anything else I recommend updating to aroma.core 2.9.5 and
>>> >> >> aroma.affymetrix 2.9.4:
>>> >> >>
>>> >> >> source("http://aroma-project.org/hbLite.R");
>>> >> >> hbInstall("aroma.affymetrix");
>>> >> >>
>>> >> >>
>>> >> >> On Tue, May 28, 2013 at 9:37 AM, Wei Tang <tangw...@gmail.com>
>>> >> >> wrote:
>>> >> >> > Hi aroma.affymetrix developers,
>>> >> >> >
>>> >> >> > Before I start the analysis, I just want to confirm the CN analysis
>>> >> >> > of 500K arrays with doCRMAv2, as I did not find a vignette
>>> >> >> > specifically about it.
>>> >> >> >
>>> >> >> > What I understand is,
>>> >> >> >
>>> >> >> > 1. run 250K_Nsp
>>> >> >> > dsC_Nsp=doCRMAv2(test,cdf="Nsp",verbose=verbose)
>>> >> >> >
>>> >> >> > 2. run 250K_Sty
>>> >> >> >
>>> >> >> > dsC_Sty=doCRMAv2(test,cdf="Sty",verbose=verbose)
>>> >> >>
>>> >> >> Yes, you can do CRMAv2 preprocessing for each chip type independently.
>>> >> >> However, for doCRMAv2() you need to do something like:
>>> >> >>
>>> >> >> dsC_Nsp <- doCRMAv2(dataSet, chipType="Mapping250K_Nsp",
>>> >> >> verbose=verbose)
>>> >> >> dsC_Sty <- doCRMAv2(dataSet, chipType="Mapping250K_Sty",
>>> >> >> verbose=verbose)
>>> >> >>
>>> >> >> Chip types have formal and strict names, cf.
>>> >> >> http://aroma-project.org/definitions/chipTypesAndCDFs
>>> >> >>
>>> >> >> >
>>> >> >> > 3. merge them together by "aroma.cn"
>>> >> >>
>>> >> >> Actually, despite its name, you don't need the aroma.cn package here.
>>> >> >> The basic CBS methods are still in the aroma.core package.  So, after
>>> >> >> doing the above doCRMAv2() processing, you then want to do something
>>> >> >> like:
>>> >> >>
>>> >> >> tags <- "ACC,-XY,BPN,-XY,AVG,A+B,FLN,-XY";  # Tags added by CRMAv2
>>> >> >> res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
>>> >> >> "Mapping250K_Sty"), verbose=verbose);
>>> >> >>
>>> >> >> It's important that the array *names* of the Mapping250K_Nsp and
>>> >> >> Mapping250K_Sty sets pair up, because that is how doCBS() knows which
>>> >> >> array files to pair up/merge in the segmentation.  doCBS() matches
>>> >> >> arrays using the names from getNames(), e.g.
>>> >> >>
>>> >> >> names_Nsp <- getNames(dsC_Nsp);
>>> >> >> names_Sty <- getNames(dsC_Sty);
>>> >> >>
>>> >> >> If they don't match up, there are ways to "change" the names so they
>>> >> >> do, cf. http://aroma-project.org/howtos/setFullNamesTranslator
>>> >> >>
>>> >> >> >
>>> >> >> > Would you mind telling me if I am correct with the analysis?
>>> >> >> >
>>> >> >> > I also have SNP5.0 to merge, so should I merge all 3 arrays at one
>>> >> >> > time, or merge 500K first and then SNP5.0?
>>> >> >>
>>> >> >> You can just include them as a third chiptype set above, e.g.
>>> >> >>
>>> >> >> res <- doCBS(dataSet, tags=tags, chipTypes=c("Mapping250K_Nsp",
>>> >> >> "Mapping250K_Sty", "GenomeWideSNP_5"), verbose=verbose);
>>> >> >>
>>> >> >> Hope this helps/gets you started
>>> >> >>
>>> >> >> /Henrik
>>> >> >>
>>> >> >> >
>>> >> >> > Thank you very much,
>>> >> >> >
>>> >> >> > Wei
>>> >> >> >
>>> >> >> > NCI/NIH
>>> >> >> >
>>> >> >> >
>>> >> >> >
>>> >> >> > --
>>> >> >> > --
>>> >> >> > When reporting problems on aroma.affymetrix, make sure 1) to run
>>> >> >> > the
>>> >> >> > latest
>>> >> >> > version of the package, 2) to report the output of sessionInfo()
>>> >> >> > and
>>> >> >> > traceback(), and 3) to post a complete code example.
>>> >> >> >
>>> >> >> >
>>> >> >> > You received this message because you are subscribed to the
>>> >> >> > Google
>>> >> >> > Groups
>>> >> >> > "aroma.affymetrix" group with website
>>> >> >> > http://www.aroma-project.org/.
>>> >> >> > To post to this group, send email to aroma-af...@googlegroups.com
>>> >> >> > To unsubscribe and other options, go to
>>> >> >> > http://www.aroma-project.org/forum/
>>> >> >> >
>>> >> >> > ---
>>> >> >> > You received this message because you are subscribed to the
>>> >> >> > Google
>>> >> >> > Groups
>>> >> >> > "aroma.affymetrix" group.
>>> >> >> > To unsubscribe from this group and stop receiving emails from it,
>>> >> >> > send
>>> >> >> > an
>>> >> >> > email to aroma-affymetr...@googlegroups.com.
>>> >> >> > For more options, visit https://groups.google.com/groups/opt_out.
>>> >> >> >
>>> >> >> >
>>> >> >
>>> >
>

