[R-sig-phylo] ape package: improvement suggestion in color handling for mosaic in phydataplot?

2017-05-23 Thread Elizabeth Purdom
Dear all,

I am trying to use the phydataplot function in the ape package with the style 
mosaic. I am having difficulty getting the colors of the rectangles to be the 
color I desire in an automatic way. I have come to the conclusion that I can’t 
do and it would have to be an enhancement to the package, though I would be 
glad to know I am wrong. 

Here is the example in phydataplot for how to set the colors

set.seed(1378923)
## use type = "mosaic" on a 30x5 matrix:
tr <- rtree(n <- 30)
p <- 5
x <- matrix(sample(3, size = n*p, replace = TRUE), n, p)
dimnames(x) <- list(paste0("t", 1:n), LETTERS[1:p])
plot(tr, x.lim = 35, align.tip = TRUE, adj = 1)
phydataplot(x, tr, "m", 2, width = 2, border = "white", lwd = 3, legend = 
"side")
f <- function(n) c("yellow", "blue", "red")
phydataplot(x, tr, "m", 18, width = 2, border = "white", lwd = 3,
legend = "side", funcol = f)

Notice we set the colors with the function  c("yellow", "blue", "red”). 
However, we see when we plot them, that this assignment resulted in 2=yellow, 
3=blue, 1=red. If instead I want 1=yellow, 2=blue, 3=red, I can use this plot I 
just made to see how they should be ordered to get my desired color assignment, 
and rearrange the colors: 

plot(tr, x.lim = 35, align.tip = TRUE, adj = 1)
phydataplot(x, tr, "m", 2, width = 2, border = "white", lwd = 3, legend = 
"side")
f <- function(n) c("blue", "red", "yellow")
phydataplot(x, tr, "m", 18, width = 2, border = "white", lwd = 3,
legend = "side", funcol = f)

However, I see no way to do this without first plotting them, manually looking 
at the arbitrary order created, and then fixing it. Yet, I would like to do 
this in an automatic fashion — I’m trying to build a function on top of this 
function — so I need to know ahead of time how phydataplot will assign the 
colors (and so if my tree changes it will still give the desired colors). 
Looking at the internal function, the relevant stream of code is the following 
(i.e. cutting away the parts not needed for determining the colors) and shows 
that the assignment of colors to data is entirely dependent on the order of the 
data after the function internally matches it to the tree:

lastPP <- get("last_plot.phylo", envir = .PlotPhyloEnv)
x <- .matchDataPhylo(x, phy)
n <- length(phy$tip.label)
one2n <- seq_len(n)
lastPP <- get("last_plot.phylo", envir = .PlotPhyloEnv)
y1 <- lastPP$yy[one2n]
o <- order(y1)
x <- if (style == "image") x[o, o]
else if (is.vector(x)) x[o]
else x[o, ]
nux <- length(ux <- unique.default(x))
x <- match(x, ux)
co <- funcol(nux)
rect(xl, yb, xr, yt, col = co[x], xpd = TRUE, ...)

This means that the colors provide by the user are assigned to the values of 
unique(x) *after* x is reordered internally by .matchDataPhylo and then again 
reordered x[o,], where o is the order of the tips of the tree in the plot, I 
believe. This is not something returned to the user, unfortunately, by 
plot.phylo, but is assigned to an hidden environment variable :
assign("last_plot.phylo", c(L, list(edge = xe, xx = xx, yy = yy)), 
envir = .PlotPhyloEnv)

These seems to make it next to impossible to get the right colors without just 
trying it out interactively. If anyone had any other alternatives I’d love to 
hear them.

I’ve tried making a little function:
getColFun<-function(x,phy,namedColors){
x <- ape:::.matchDataPhylo(x, phy)
n <- length(phy$tip.label)
one2n <- seq_len(n)
lastPP <- get("last_plot.phylo", envir = .PlotPhyloEnv)
y1 <- lastPP$yy[one2n]
o <- order(y1)
ux<-unique.default(x[o])
m<-match(as.character(ux),names(namedColors))
function(n){namedColors[m]}
}
But it’s a bit awkward because it relies on .PlotPhyloEnv which is not always 
available if you haven’t loaded the package, as well as the internal function 
.matchDataPhylo. This is problematic for putting it into a package. Is there a 
function I’m not aware of that would be helpful?

In terms of enhancement, it would be nice if instead phydataplot accepted a 
named vector of colors, where the vector of colors is required to have names 
that match the values of the data in x, and then the phydataplot internally 
would figure out how to match them together. Less desirable would be if the 
output of plot.phylo returned the information in ‘L’ that is saved in the 
.PlotPhyloEnv, then I could at least recreate a more stable version of my 
little function above.

(it would also be nice if the legend of the colors was in some reasonable 
sorted order, i.e. 1,2,3, and not in the arbitrary order)

Thanks,
Elizabeth Purdom






[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org

[R-sig-phylo] possible bug in the phydataplot in ape package?

2017-05-23 Thread Elizabeth Purdom
Hello,

I am writing to the R-sig-phylo mailing list regarding what seems to be a bug 
in the `phydataplot` function in `ape`.

Specifically, if the style is `mosaic` but the data is a vector or 1-column 
matrix, the function fails. 

The following code is from the examples (that obviously works):
> tr <- rtree(n <- 30)
> p <- 5
> x <- matrix(sample(3, size = n*p, replace = TRUE), n, p)
> dimnames(x) <- list(paste0("t", 1:n), LETTERS[1:p])
> plot(tr, x.lim = 35, align.tip = TRUE, adj = 1)
> phydataplot(x, tr, "m", 2)

However, if I instead use only 1 column of x (either vector or force to remain 
a matrix):

> phydataplot(x[,1],tr,"m",2)
Error in seq.default(x0, x1, width) : invalid (to - from)/by in seq(.)
> phydataplot(x[,1,drop=FALSE],tr,"m",2)
Error in seq.default(x0, x1, width) : invalid (to - from)/by in seq(.)


This appears to be because of the following code within the function:
p <- ncol(x)
if (is.null(width)) {
x1 <- lastPP$x.lim[2]
width <- (x1 - x0)/p
} else x1 <- x0 + width * p
xx <- seq(x0, x1, width)

Since `x` is the output of `.matchDataPhylo(x, phy)`, it is always a vector, 
even if original x is a 1-column matrix, so that ncol(x) is NULL. Probably if 
there is a catch for when x is a vector to set p=1, then it would fix the 
problem, though I haven’t tested whether there are any further problems that 
come up.

All of the best,
Elizabeth Purdom

Here is my session info:
> sessionInfo()
R version 3.3.0 beta (2016-04-04 r70420)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.10.5 (Yosemite)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4stats graphics  grDevices utils datasets  
methods   base 

other attached packages:
 [1] clusterExperiment_1.3.1ape_4.1
SummarizedExperiment_1.4.0
 [4] Biobase_2.34.0 GenomicRanges_1.26.4   GenomeInfoDb_1.10.3  
 
 [7] IRanges_2.8.2  S4Vectors_0.12.2   BiocGenerics_0.20.0  
 
[10] devtools_1.12.0   

loaded via a namespace (and not attached):
 [1] nlme_3.1-131   bitops_1.0-6   matrixStats_0.52.2 bold_0.4.0
 doParallel_1.0.10 
 [6] RColorBrewer_1.1-2 progress_1.1.2 httr_1.2.1 prabclus_2.2-6
 tools_3.3.0   
[11] R6_2.2.0   DBI_0.6-1  lazyeval_0.2.0 colorspace_1.3-2  
 ade4_1.7-6
[16] trimcluster_0.1-2  nnet_7.3-12withr_1.0.2gridExtra_2.2.1   
 prettyunits_1.0.2 
[21] xml2_1.1.1 pkgmaker_0.22  diptest_0.75-7 scales_0.4.1  
 DEoptimR_1.0-8
[26] mvtnorm_1.0-6  robustbase_0.92-7  NMF_0.20.6 commonmark_1.2
 stringr_1.2.0 
[31] digest_0.6.12  XVector_0.14.1 limma_3.30.13  howmany_0.3-1 
 jsonlite_1.4  
[36] mclust_5.2.3   dendextend_1.5.2   dplyr_0.5.0RCurl_1.95-4.8
 magrittr_1.5  
[41] modeltools_0.2-21  Matrix_1.2-10  Rcpp_0.12.10   munsell_0.4.3 
 abind_1.4-5   
[46] viridis_0.4.0  stringi_1.1.5  whisker_0.3-2  MASS_7.3-47   
 zlibbioc_1.20.0   
[51] flexmix_2.3-14 MAST_1.0.5 plyr_1.8.4 grid_3.3.0
 crayon_1.3.2  
[56] rncl_0.8.2 lattice_0.20-35splines_3.3.0  uuid_0.1-2
 taxize_0.8.4  
[61] fpc_2.1-10 rngtools_1.2.4 reshape2_1.4.2 codetools_0.2-15  
 XML_3.98-1.7  
[66] RNeXML_2.0.7   data.table_1.10.4  foreach_1.4.3  testthat_1.0.2
 locfdr_1.1-8  
[71] gtable_0.2.0   tidyr_0.6.2reshape_0.8.6  kernlab_0.9-25
 assertthat_0.2.0  
[76] ggplot2_2.2.1  gridBase_0.4-7 phylobase_0.8.4xtable_1.8-2  
 roxygen2_6.0.1
[81] class_7.3-14   viridisLite_0.2.0  tibble_1.3.0   iterators_1.0.8   
 registry_0.3  
[86] memoise_1.1.0  cluster_2.0.6 
> 
[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

Re: [R-sig-phylo] Rooting trees in R

2017-05-23 Thread Jacob Berv
see root() in the R package ape or reroot() in phytools
J

> On May 23, 2017, at 1:22 PM, Valentine Usongo  
> wrote:
> 
> Dear members
> 
> My name is Valentine Usongo and  I am a postdoctoral trainee at the INSPQ 
> Montreal-Quebec. I am working on the use of whole genome sequenced based 
> methods for the detection of foodborne outbreaks in Salmonella. I am still 
> a neophyte in the field of bioinformatics. I just concatenated 14 
> individual trees with the cat function in linux. These trees were in 
> newick format and they were generated using the SNVPhyl pipeline 
> of the National Reference Microbiology Laboratory of Canada. These trees 
> were generated by mapping my Salmonella isolates to a reference genome and 
> the only thing that differs in all the trees is the reference genome since 
> each tree has a different reference and my objective here was to assess 
> the impact of the choice of  reference genome on the ability of the SNV 
> method to separate outbreak from nonoutbreak isolates. My  goal is to 
> compare the topologies of these trees using the Kendall-Colijn (rooted) 
> method. This method requires that trees should be rooted but mine are not 
> rooted. 
> Is it possible to root unrooted trees in R or linux and if so how can one 
> go about it ? The metric requires that the input trees should be in the 
> nexus format and they should be rooted. My trees are in newick format and 
> I converted them to nexus in R but then got stuck because I cannot figure 
> how to root the trees. I would be very grateful for your input. 
> Looking forward to hearing from you. 
> Yours sincerely 
> Valentine Usongo 
>   [[alternative HTML version deleted]]
> 
> ___
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/


[R-sig-phylo] Rooting trees in R

2017-05-23 Thread Valentine Usongo
Dear members

My name is Valentine Usongo and  I am a postdoctoral trainee at the INSPQ 
Montreal-Quebec. I am working on the use of whole genome sequenced based 
methods for the detection of foodborne outbreaks in Salmonella. I am still 
a neophyte in the field of bioinformatics. I just concatenated 14 
individual trees with the cat function in linux. These trees were in 
newick format and they were generated using the SNVPhyl pipeline 
of the National Reference Microbiology Laboratory of Canada. These trees 
were generated by mapping my Salmonella isolates to a reference genome and 
the only thing that differs in all the trees is the reference genome since 
each tree has a different reference and my objective here was to assess 
the impact of the choice of  reference genome on the ability of the SNV 
method to separate outbreak from nonoutbreak isolates. My  goal is to 
compare the topologies of these trees using the Kendall-Colijn (rooted) 
method. This method requires that trees should be rooted but mine are not 
rooted. 
Is it possible to root unrooted trees in R or linux and if so how can one 
go about it ? The metric requires that the input trees should be in the 
nexus format and they should be rooted. My trees are in newick format and 
I converted them to nexus in R but then got stuck because I cannot figure 
how to root the trees. I would be very grateful for your input. 
Looking forward to hearing from you. 
Yours sincerely 
Valentine Usongo 
[[alternative HTML version deleted]]

___
R-sig-phylo mailing list - R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/