Dear Amy,

This is to let you know that the insideFeature column has been fixed (Version: 1.2.6). If you run your example below, you should get the correct classification. Please let me know if you encounter any problems. Thank you so much for the helpful feedback!
BTW, do you mind if I use your example below in the help manual? Thanks!

Best regards,

Julie

*******************************************
Lihua Julie Zhu, Ph.D
Research Associate Professor
Program Gene Function and Expression
University of Massachusetts Medical School
364 Plantation Street, Room 613
Worcester, MA 01605
508-856-5256
http://www.umassmed.edu/pgfe/faculty/zhu.cfm
*******************************************

On 3/9/10 6:23 AM, "Amy Molesworth" <[email protected]> wrote:

Firstly I'd like to thank the authors of the very useful package ChIPpeakAnno.

I'd like to report a feature in the results of the ChIPpeakAnno annotatePeakInBatch function that other users may or may not be aware of. I also propose improvements to compensate.

The resulting insideFeature column reports TRUE if the query peak is contained within an annotated feature, and also reports TRUE if it overlaps the end of an annotated feature. I think it's worth noting that it reports FALSE if the peak overlaps the beginning of an annotated feature, and also reports FALSE if the peak overlaps an annotated feature (or features) in its entirety.

So, perhaps the insideFeature column (or an additional new column called overlappingFeature) could report five options: ("false","inside","overlapStart","overlapEnd","super").

I haven't looked into the effects on how distanceToFeature should/could be called for each different scenario.

Apologies if this has already been addressed, or if others do not consider this useful. Details with a dummy example are described below.

Many thanks,
Amy.

#####

In the dummy example below, p1 is bigger than f1 and consequently p1 overlaps it in its entirety. It would be nice if ChIPpeakAnno could report this, although I accept it may overlap more than one feature, so would need to consider how to deal with that.

And another example from below: p3 in fact overlaps with the start of f3, but is called as insideFeature=FALSE. It would be nice if ChIPpeakAnno could report it as OverlapStart.
p4 is called as insideFeature = TRUE for overlapping with f4, but it would be nice if ChIPpeakAnno could report it as OverlapEnd or something similar.

And correctly p2 is called as insideFeature = TRUE for overlap with f2, in this case p2 ranges are within the f2 ranges as you would expect.

library(ChIPpeakAnno)

peaks = RangedData(IRanges(start=c(1543200,1557200,1563000,1569800,167889600),
                           end=c(1555199,1560599,1565199,1573799,167893599),
                           names=c("p1","p2","p3","p4","p5")),
                   strand=as.integer(1),
                   space=c(6,6,6,6,5))

features = RangedData(IRanges(start=c(1549800,1554400,1565000,1569400,167888600),
                              end=c(1550599,1560799,1565399,1571199,167888999),
                              names=c("f1","f2","f3","f4","f5")),
                      strand=as.integer(1),
                      space=c(6,6,6,6,5))

annoPeaks = annotatePeakInBatch(peaks,AnnotationData=features)

as.data.frame(annoPeaks)
  space     start       end width names strand feature start_position
1     5 167889600 167893599  4000    p5      1      f5      167888600
2     6   1543200   1555199 12000    p1      1      f1        1549800
3     6   1557200   1560599  3400    p2      1      f2        1554400
4     6   1563000   1565199  2200    p3      1      f3        1565000
5     6   1569800   1573799  4000    p4      1      f4        1569400
  end_position insideFeature distancetoFeature
1    167888999         FALSE              1000
2      1550599         FALSE             -6600
3      1560799          TRUE              2800
4      1565399         FALSE             -2000
5      1571199          TRUE               400

> sessionInfo()
R version 2.10.0 (2009-10-26)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=C
 [5] LC_MONETARY=C              LC_MESSAGES=C
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] ChIPpeakAnno_1.3.0                  org.Hs.eg.db_2.3.6
 [3] GO.db_2.3.5                         RSQLite_0.7-3
 [5] DBI_0.2-4                           AnnotationDbi_1.8.0
 [7] BSgenome.Ecoli.NCBI.20080805_1.3.16 BSgenome_1.14.0
 [9] Biostrings_2.14.2                   IRanges_1.5.18
[11] multtest_2.2.0                      Biobase_2.6.0
[13] biomaRt_2.3.0

loaded via a namespace (and not attached):
[1] MASS_7.3-3      RCurl_1.3-0     XML_2.6-0       splines_2.10.0
[5] survival_2.35-7

-----------------------------------------------------------
This e-mail was sent by GlaxoSmithKline Services Unlimited (registered in England and Wales No. 1047315), which is a member of the GlaxoSmithKline group of companies. The registered address of GlaxoSmithKline Services Unlimited is 980 Great West Road, Brentford, Middlesex TW8 9GS.
-----------------------------------------------------------

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
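For readers following the thread, the five-way classification proposed in Amy's message ("false", "inside", "overlapStart", "overlapEnd", "super") can be sketched in base R as below. This is only an illustration of the proposal, not the ChIPpeakAnno implementation and not necessarily what the 1.2.6 fix does. The function name classify_overlap is hypothetical, and the sketch assumes plus-strand features (so "start" is the lower coordinate), inclusive coordinates, and that each peak has already been paired with a single feature on the same chromosome.

```r
# Hypothetical helper (not part of ChIPpeakAnno): classify how each peak
# relates to the feature it was paired with. All four arguments are numeric
# vectors of equal length; coordinates are inclusive, plus strand assumed.
classify_overlap <- function(peak_start, peak_end, feat_start, feat_end) {
  stopifnot(all(peak_start <= peak_end), all(feat_start <= feat_end))

  # Two closed intervals overlap when each one starts before the other ends.
  overlaps <- peak_start <= feat_end & peak_end >= feat_start
  # Peak lies entirely within the feature.
  inside <- peak_start >= feat_start & peak_end <= feat_end
  # Peak covers the whole feature.
  super <- peak_start <= feat_start & peak_end >= feat_end

  result <- rep("false", length(peak_start))
  result[overlaps & peak_start < feat_start] <- "overlapStart"
  result[overlaps & peak_end > feat_end] <- "overlapEnd"
  # Later assignments take precedence: a peak extending past both ends is
  # "super", and a peak identical to its feature is reported as "inside".
  result[super] <- "super"
  result[inside] <- "inside"
  result
}

# The peak/feature pairs from the dummy example, in the order p1..p5.
peak_start <- c(1543200, 1557200, 1563000, 1569800, 167889600)
peak_end   <- c(1555199, 1560599, 1565199, 1573799, 167893599)
feat_start <- c(1549800, 1554400, 1565000, 1569400, 167888600)
feat_end   <- c(1550599, 1560799, 1565399, 1571199, 167888999)

calls <- classify_overlap(peak_start, peak_end, feat_start, feat_end)
names(calls) <- c("p1", "p2", "p3", "p4", "p5")
print(calls)
```

For the dummy example this gives "super" for p1, "inside" for p2, "overlapStart" for p3, "overlapEnd" for p4 and "false" for p5, which matches the behaviour requested in the message. It does not address the open question of a peak overlapping more than one feature (p1 also reaches the start of f2), nor how distanceToFeature should be defined in each case.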
