[Background for r-sig-phylo: some of us have been talking about
the problems of grabbing trees from the literature when they
are not available in TreeBASE, or in Nexus or Newick format
from the authors. Reconstructing Newick format from a big
tree is a huge pain, as anyone who has tried it will know, and
even then one wants the branch lengths as well as the topology.]

The problem of reconstructing trees from a set of (x,y) points
turns out not to be all that hard -- even "trivial" from the
computational point of view. The R function below takes
a set of (x,y) points, the number of tips, and tip labels, and
returns a tree in "phylo" format [it assumes that all
the tips come first in the list of points; otherwise I think
order shouldn't matter]. I haven't tried it on
non-ultrametric trees, and I know that polytomies will
be trouble.
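
The core step is matching each internal node to its two daughters
by their y positions. Here is a minimal sketch of that step on its own,
assuming a binary tree and a plot in which each internal node sits
midway (in y) between its daughters. The helper name find_daughters and
the tolerance argument tol are hypothetical, for illustration only; they
are not part of ape.

```r
## Hypothetical helper (not part of ape): indices of the current tips
## that are closest in y to a given internal node. Distances are
## compared with a small tolerance rather than exact equality.
find_daughters <- function(yy, is.tip, node, tol = 1e-8) {
  d <- abs(yy - yy[node])
  which(is.tip & d <= min(d[is.tip]) + tol)
}

## Toy layout: three tips at y = 1, 2, 3, then two internal nodes.
## Node 5 (y = 1.5) joins tips 1 and 2; node 4 (y = 2.25) is the root.
yy <- c(1, 2, 3, 2.25, 1.5)
is.tip <- c(TRUE, TRUE, TRUE, FALSE, FALSE)
d5 <- find_daughters(yy, is.tip, node = 5)
```

For points digitized from a figure the tolerance would probably have to
be much larger than the default used here, since clicking errors are far
bigger than floating-point error.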

The examples below take the node (x,y) locations from
some of the ape examples (the tiny owl tree and the
bird.orders data set), which are retrievable using some
black magic (the "last_plot.phylo" object that ape keeps in
.PlotPhyloEnv after a plot), and reconstruct the trees.
**The trees do not come back in the same order** (is this
a problem?) but they are equivalent.
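
One way to check that two trees are equivalent despite a different
ordering is ape's all.equal method for "phylo" objects. This is a
sketch on the owl tree only; the second Newick string is the same tree
with the sister clades swapped, written out by hand for illustration.

```r
library(ape)

## The owl tree, and the same tree with sister clades swapped
t1 <- read.tree(text = paste0("(((Strix_aluco:4.2,Asio_otus:4.2):3.1,",
                              "Athene_noctua:7.3):6.3,Tyto_alba:13.5);"))
t2 <- read.tree(text = paste0("(Tyto_alba:13.5,(Athene_noctua:7.3,",
                              "(Asio_otus:4.2,Strix_aluco:4.2):3.1):6.3);"))

## all.equal on two "phylo" objects compares topology, tip labels and
## (by default) branch lengths, not the internal ordering of the edges
same <- all.equal(t1, t2)
```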

Getting the (x,y) points into R in the first place is also
a potential challenge. Two possible solutions: one is g3data
(notes included below), a standalone, cross-platform
utility for retrieving point locations from image files.
Alternatively, one could write a small R program that takes
an image file, plots it, and uses locator() to get
the points (using pixmap::read.pnm?).
I think I've written something like this
before, but would have to dig it up or redo it -- and
g3data has a nicer interface.
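
Whichever tool does the clicking, raw pixel positions have to be
rescaled to data units using two reference points of known value per
axis. A minimal sketch of that linear calibration follows; the helper
name pixel_to_data and all the numbers are made up for illustration.
In an R-based version the pixel positions would come from the x and y
components of the list that locator() returns.

```r
## Hypothetical helper: map pixel positions to data units along one axis,
## given two reference pixels (pix_ref) with known data values (val_ref).
pixel_to_data <- function(pix, pix_ref, val_ref) {
  slope <- diff(val_ref) / diff(pix_ref)
  val_ref[1] + (pix - pix_ref[1]) * slope
}

## Made-up numbers: pixels 100 and 500 mark x = 0 and x = 20 on the figure
clicked_x <- c(100, 300, 500)
data_x <- pixel_to_data(clicked_x, pix_ref = c(100, 500), val_ref = c(0, 20))
```

The same function would be applied separately to the y pixel positions
with their own pair of reference points.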
##
library(ape)
## from ?plot.phylo:
cat("(((Strix_aluco:4.2,Asio_otus:4.2):3.1,",
    "Athene_noctua:7.3):6.3,Tyto_alba:13.5);",
    file = "ex.tre", sep = "\n")
tree.owls <- read.tree("ex.tre")
unlink("ex.tre") # delete the file "ex.tre"
plot(tree.owls)
## retrieve the plotting coordinates of all nodes (tips come first)
xy <- get("last_plot.phylo", envir = .PlotPhyloEnv)
xx <- xy$xx
yy <- xy$yy
## label each node on the plot with its number
points(xx, yy, col = "white", pch = 16, cex = 2)
text(xx, yy, labels = seq_along(xx), col = 2)
## assumes left-to-right horizontal tree -- may need some logic for
## different directions
## assumes first N points are tips.
##
## polytomies?? may need to be explicitly identified ...
## should?? work on non-ultrametric trees, but untested
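build.tree <- function(xx, yy, tip.labels, ntips,
                       poly = numeric(0),  # reserved for polytomies; unused so far
                       debug = FALSE) {
  if (!missing(tip.labels)) {
    ntips <- length(tip.labels)
  } else {
    tip.labels <- paste0("t", seq_len(ntips))  # default labels
  }
  nodes <- seq_along(xx)
  is.tip <- nodes <= ntips
  if (which.min(xx) != ntips + 1) {
    ## reorder internal nodes the way ape/phylo expects: sorted by x,
    ## so that the root (smallest x) becomes node ntips+1
    internal <- which(!is.tip)
    ord <- internal[order(xx[internal])]
    yy[internal] <- yy[ord]
    xx[internal] <- xx[ord]
  }
  edges <- matrix(nrow = 0, ncol = 2)
  edge.length <- numeric(0)
  nnode <- length(xx) - ntips
  tol <- sqrt(.Machine$double.eps)  # guards against numeric fuzz
  while (length(xx) > 1) {
    ## find next node to include: rightmost unprocessed internal node
    nextnode <- which(!is.tip & xx == max(xx[!is.tip]))[1]
    ## find daughters: the current tips closest to it in y
    dist <- abs(yy - yy[nextnode])
    daughters <- which(is.tip & dist <= min(dist[is.tip]) + tol)
    if (length(daughters) != 2)
      stop("node ", nodes[nextnode], ": expected 2 daughters, found ",
           length(daughters), " (polytomy or noisy coordinates?)")
    if (debug) cat("node", nodes[nextnode], "<-", nodes[daughters], "\n")
    edges <- rbind(edges,
                   nodes[c(nextnode, daughters[1])],
                   nodes[c(nextnode, daughters[2])])
    edge.length <- c(edge.length, xx[daughters] - xx[nextnode])
    ## drop the daughters; the merged node now acts as a tip
    xx <- xx[-daughters]
    yy <- yy[-daughters]
    is.tip[nextnode] <- TRUE
    is.tip <- is.tip[-daughters]
    nodes <- nodes[-daughters]
  }
  zz <- list(tip.label = tip.labels,  # "phylo" objects use 'tip.label'
             edge = edges,
             edge.length = edge.length,
             Nnode = nnode)
  class(zz) <- "phylo"
  zz <- reorder(zz)
  zz
}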
newtree <- build.tree(xx,yy,tree.owls$tip.label)
data(bird.orders)
plot(bird.orders, show.node.label = TRUE)
xy <- get("last_plot.phylo", envir = .PlotPhyloEnv)
## extract the new coordinates before drawing the node numbers
xx <- xy$xx
yy <- xy$yy
points(xx, yy, col = "white", pch = 16, cex = 2)
text(xx, yy, labels = seq_along(xx), col = 2)
newtree2 <- build.tree(xx, yy, bird.orders$tip.label)
plot(newtree2)
=============
g3data notes:
=============
INSTALLATION: install g3data and (for Windows) clip2png.jar
Ubuntu and other Debians:
  sudo apt-get install g3data
Windows:
  http://www.frantz.fi/software/Windows/g3data-1.5.1-win32.zip
Mac (OS X 10.4 or 10.5): available via fink
  http://www.finkproject.org/doc/users-guide/index.php
  fink install g3data (?) or
  fink -b install g3data
get clip2png.jar:
  google "clip2png.jar", or go to ...
  http://sourceforge.net/project/showfiles.php?group_id=185579
  click on "download"
  scroll down and click on "clip2png.jar"
  save it somewhere (desktop?)
USAGE
open the paper in your favorite PDF viewer
select the desired figure, including axes but as little else as
  possible, and copy to the clipboard, then save the clipboard as
  a PNG or GIF
  OR adjust the PDF window so the figure fills it and take a
  snapshot of the window (on Ubuntu: alt-printscreen), save as
  PNG or GIF
open g3data
click on two points on the X and Y axis, fill in values
click on points
if you need to compress the display so that you can see the output
  actions, use the View menu or function keys to toggle display of
  zoom area (F5), axis settings (F6), or output properties (F7)
for multiple series, either click on points in order (e.g. work
  left-to-right for each series), then edit your output to put
  tags on increasing series, or output each series to a separate
  data file
note that by default g3data will save your data to a file named
  after your graphics file, e.g. "mydata.png.dat" -- which means
  that it will show up in Windows as a file called "mydata.png",
  with a DAT file type -- which may be confusing.
reading into Excel: use "Data" menu to separate into columns
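
The saved points can also be read straight into R. This is a sketch
that assumes the output file holds two whitespace-separated columns
(x, then y) with no header, which I have not checked against the g3data
documentation. The file contents below are made-up values standing in
for a real export, with the four tips digitized first.

```r
## Stand-in for a g3data output file. ASSUMPTION: two whitespace-separated
## columns (x then y), no header line.
datfile <- tempfile(fileext = ".png.dat")
writeLines(c("13.5 1", "13.5 2", "13.5 3", "13.5 4",   # tips first
             "0 3.125", "6.3 2.25", "9.4 1.5"),        # internal nodes
           datfile)

pts <- read.table(datfile, col.names = c("x", "y"))
unlink(datfile)

ntips <- 4   # number of tips clicked before the internal nodes
xx <- pts$x
yy <- pts$y
```

The resulting xx and yy vectors, together with the tip labels typed in
by hand, are in the form that build.tree expects.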
Wish list for g3data:
  csv format output?
  series tagging?
  keyboard shortcuts for Save (Ctrl-S), Save As (Ctrl-A)?
  built-in documentation?
_______________________________________________
R-sig-phylo mailing list
R-sig-phylo@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo