[R-sig-phylo] ape package: improvement suggestion in color handling for mosaic in phydataplot?
Dear all, I am trying to use the phydataplot function in the ape package with the style mosaic. I am having difficulty getting the colors of the rectangles to be the color I desire in an automatic way. I have come to the conclusion that I can’t do and it would have to be an enhancement to the package, though I would be glad to know I am wrong. Here is the example in phydataplot for how to set the colors set.seed(1378923) ## use type = "mosaic" on a 30x5 matrix: tr <- rtree(n <- 30) p <- 5 x <- matrix(sample(3, size = n*p, replace = TRUE), n, p) dimnames(x) <- list(paste0("t", 1:n), LETTERS[1:p]) plot(tr, x.lim = 35, align.tip = TRUE, adj = 1) phydataplot(x, tr, "m", 2, width = 2, border = "white", lwd = 3, legend = "side") f <- function(n) c("yellow", "blue", "red") phydataplot(x, tr, "m", 18, width = 2, border = "white", lwd = 3, legend = "side", funcol = f) Notice we set the colors with the function c("yellow", "blue", "red”). However, we see when we plot them, that this assignment resulted in 2=yellow, 3=blue, 1=red. If instead I want 1=yellow, 2=blue, 3=red, I can use this plot I just made to see how they should be ordered to get my desired color assignment, and rearrange the colors: plot(tr, x.lim = 35, align.tip = TRUE, adj = 1) phydataplot(x, tr, "m", 2, width = 2, border = "white", lwd = 3, legend = "side") f <- function(n) c("blue", "red", "yellow") phydataplot(x, tr, "m", 18, width = 2, border = "white", lwd = 3, legend = "side", funcol = f) However, I see no way to do this without first plotting them, manually looking at the arbitrary order created, and then fixing it. Yet, I would like to do this in an automatic fashion — I’m trying to build a function on top of this function — so I need to know ahead of time how phydataplot will assign the colors (and so if my tree changes it will still give the desired colors). Looking at the internal function, the relevant stream of code is the following (i.e. cutting away the parts not needed for determining the colors) and shows that the assignment of colors to data is entirely dependent on the order of the data after the function internally matches it to the tree: lastPP <- get("last_plot.phylo", envir = .PlotPhyloEnv) x <- .matchDataPhylo(x, phy) n <- length(phy$tip.label) one2n <- seq_len(n) lastPP <- get("last_plot.phylo", envir = .PlotPhyloEnv) y1 <- lastPP$yy[one2n] o <- order(y1) x <- if (style == "image") x[o, o] else if (is.vector(x)) x[o] else x[o, ] nux <- length(ux <- unique.default(x)) x <- match(x, ux) co <- funcol(nux) rect(xl, yb, xr, yt, col = co[x], xpd = TRUE, ...) This means that the colors provide by the user are assigned to the values of unique(x) *after* x is reordered internally by .matchDataPhylo and then again reordered x[o,], where o is the order of the tips of the tree in the plot, I believe. This is not something returned to the user, unfortunately, by plot.phylo, but is assigned to an hidden environment variable : assign("last_plot.phylo", c(L, list(edge = xe, xx = xx, yy = yy)), envir = .PlotPhyloEnv) These seems to make it next to impossible to get the right colors without just trying it out interactively. If anyone had any other alternatives I’d love to hear them. I’ve tried making a little function: getColFun<-function(x,phy,namedColors){ x <- ape:::.matchDataPhylo(x, phy) n <- length(phy$tip.label) one2n <- seq_len(n) lastPP <- get("last_plot.phylo", envir = .PlotPhyloEnv) y1 <- lastPP$yy[one2n] o <- order(y1) ux<-unique.default(x[o]) m<-match(as.character(ux),names(namedColors)) function(n){namedColors[m]} } But it’s a bit awkward because it relies on .PlotPhyloEnv which is not always available if you haven’t loaded the package, as well as the internal function .matchDataPhylo. This is problematic for putting it into a package. Is there a function I’m not aware of that would be helpful? In terms of enhancement, it would be nice if instead phydataplot accepted a named vector of colors, where the vector of colors is required to have names that match the values of the data in x, and then the phydataplot internally would figure out how to match them together. Less desirable would be if the output of plot.phylo returned the information in ‘L’ that is saved in the .PlotPhyloEnv, then I could at least recreate a more stable version of my little function above. (it would also be nice if the legend of the colors was in some reasonable sorted order, i.e. 1,2,3, and not in the arbitrary order) Thanks, Elizabeth Purdom [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org
[R-sig-phylo] possible bug in the phydataplot in ape package?
Hello, I am writing to the R-sig-phylo mailing list regarding what seems to be a bug in the `phydataplot` function in `ape`. Specifically, if the style is `mosaic` but the data is a vector or 1-column matrix, the function fails. The following code is from the examples (that obviously works): > tr <- rtree(n <- 30) > p <- 5 > x <- matrix(sample(3, size = n*p, replace = TRUE), n, p) > dimnames(x) <- list(paste0("t", 1:n), LETTERS[1:p]) > plot(tr, x.lim = 35, align.tip = TRUE, adj = 1) > phydataplot(x, tr, "m", 2) However, if I instead use only 1 column of x (either vector or force to remain a matrix): > phydataplot(x[,1],tr,"m",2) Error in seq.default(x0, x1, width) : invalid (to - from)/by in seq(.) > phydataplot(x[,1,drop=FALSE],tr,"m",2) Error in seq.default(x0, x1, width) : invalid (to - from)/by in seq(.) This appears to be because of the following code within the function: p <- ncol(x) if (is.null(width)) { x1 <- lastPP$x.lim[2] width <- (x1 - x0)/p } else x1 <- x0 + width * p xx <- seq(x0, x1, width) Since `x` is the output of `.matchDataPhylo(x, phy)`, it is always a vector, even if original x is a 1-column matrix, so that ncol(x) is NULL. Probably if there is a catch for when x is a vector to set p=1, then it would fix the problem, though I haven’t tested whether there are any further problems that come up. All of the best, Elizabeth Purdom Here is my session info: > sessionInfo() R version 3.3.0 beta (2016-04-04 r70420) Platform: x86_64-apple-darwin13.4.0 (64-bit) Running under: OS X 10.10.5 (Yosemite) locale: [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] parallel stats4stats graphics grDevices utils datasets methods base other attached packages: [1] clusterExperiment_1.3.1ape_4.1 SummarizedExperiment_1.4.0 [4] Biobase_2.34.0 GenomicRanges_1.26.4 GenomeInfoDb_1.10.3 [7] IRanges_2.8.2 S4Vectors_0.12.2 BiocGenerics_0.20.0 [10] devtools_1.12.0 loaded via a namespace (and not attached): [1] nlme_3.1-131 bitops_1.0-6 matrixStats_0.52.2 bold_0.4.0 doParallel_1.0.10 [6] RColorBrewer_1.1-2 progress_1.1.2 httr_1.2.1 prabclus_2.2-6 tools_3.3.0 [11] R6_2.2.0 DBI_0.6-1 lazyeval_0.2.0 colorspace_1.3-2 ade4_1.7-6 [16] trimcluster_0.1-2 nnet_7.3-12withr_1.0.2gridExtra_2.2.1 prettyunits_1.0.2 [21] xml2_1.1.1 pkgmaker_0.22 diptest_0.75-7 scales_0.4.1 DEoptimR_1.0-8 [26] mvtnorm_1.0-6 robustbase_0.92-7 NMF_0.20.6 commonmark_1.2 stringr_1.2.0 [31] digest_0.6.12 XVector_0.14.1 limma_3.30.13 howmany_0.3-1 jsonlite_1.4 [36] mclust_5.2.3 dendextend_1.5.2 dplyr_0.5.0RCurl_1.95-4.8 magrittr_1.5 [41] modeltools_0.2-21 Matrix_1.2-10 Rcpp_0.12.10 munsell_0.4.3 abind_1.4-5 [46] viridis_0.4.0 stringi_1.1.5 whisker_0.3-2 MASS_7.3-47 zlibbioc_1.20.0 [51] flexmix_2.3-14 MAST_1.0.5 plyr_1.8.4 grid_3.3.0 crayon_1.3.2 [56] rncl_0.8.2 lattice_0.20-35splines_3.3.0 uuid_0.1-2 taxize_0.8.4 [61] fpc_2.1-10 rngtools_1.2.4 reshape2_1.4.2 codetools_0.2-15 XML_3.98-1.7 [66] RNeXML_2.0.7 data.table_1.10.4 foreach_1.4.3 testthat_1.0.2 locfdr_1.1-8 [71] gtable_0.2.0 tidyr_0.6.2reshape_0.8.6 kernlab_0.9-25 assertthat_0.2.0 [76] ggplot2_2.2.1 gridBase_0.4-7 phylobase_0.8.4xtable_1.8-2 roxygen2_6.0.1 [81] class_7.3-14 viridisLite_0.2.0 tibble_1.3.0 iterators_1.0.8 registry_0.3 [86] memoise_1.1.0 cluster_2.0.6 > [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
Re: [R-sig-phylo] Rooting trees in R
see root() in the R package ape or reroot() in phytools J > On May 23, 2017, at 1:22 PM, Valentine Usongo> wrote: > > Dear members > > My name is Valentine Usongo and I am a postdoctoral trainee at the INSPQ > Montreal-Quebec. I am working on the use of whole genome sequenced based > methods for the detection of foodborne outbreaks in Salmonella. I am still > a neophyte in the field of bioinformatics. I just concatenated 14 > individual trees with the cat function in linux. These trees were in > newick format and they were generated using the SNVPhyl pipeline > of the National Reference Microbiology Laboratory of Canada. These trees > were generated by mapping my Salmonella isolates to a reference genome and > the only thing that differs in all the trees is the reference genome since > each tree has a different reference and my objective here was to assess > the impact of the choice of reference genome on the ability of the SNV > method to separate outbreak from nonoutbreak isolates. My goal is to > compare the topologies of these trees using the Kendall-Colijn (rooted) > method. This method requires that trees should be rooted but mine are not > rooted. > Is it possible to root unrooted trees in R or linux and if so how can one > go about it ? The metric requires that the input trees should be in the > nexus format and they should be rooted. My trees are in newick format and > I converted them to nexus in R but then got stuck because I cannot figure > how to root the trees. I would be very grateful for your input. > Looking forward to hearing from you. > Yours sincerely > Valentine Usongo > [[alternative HTML version deleted]] > > ___ > R-sig-phylo mailing list - R-sig-phylo@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-sig-phylo > Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/ ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/
[R-sig-phylo] Rooting trees in R
Dear members My name is Valentine Usongo and I am a postdoctoral trainee at the INSPQ Montreal-Quebec. I am working on the use of whole genome sequenced based methods for the detection of foodborne outbreaks in Salmonella. I am still a neophyte in the field of bioinformatics. I just concatenated 14 individual trees with the cat function in linux. These trees were in newick format and they were generated using the SNVPhyl pipeline of the National Reference Microbiology Laboratory of Canada. These trees were generated by mapping my Salmonella isolates to a reference genome and the only thing that differs in all the trees is the reference genome since each tree has a different reference and my objective here was to assess the impact of the choice of reference genome on the ability of the SNV method to separate outbreak from nonoutbreak isolates. My goal is to compare the topologies of these trees using the Kendall-Colijn (rooted) method. This method requires that trees should be rooted but mine are not rooted. Is it possible to root unrooted trees in R or linux and if so how can one go about it ? The metric requires that the input trees should be in the nexus format and they should be rooted. My trees are in newick format and I converted them to nexus in R but then got stuck because I cannot figure how to root the trees. I would be very grateful for your input. Looking forward to hearing from you. Yours sincerely Valentine Usongo [[alternative HTML version deleted]] ___ R-sig-phylo mailing list - R-sig-phylo@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-phylo Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/