This is a reply to the message copied below from Roland Sookias.
This depends slightly on the structure of the data. If you have crocodile tails
with a set of discrete shape states in the same anatomical part, then I would
recommend coding it as a multistate character. MrBayes provides pretty
flexible coding for multistate characters, and I think it will handle up to at
least 10 states in a single character. There might be an implementation in R
now that is equivalent, but I haven't done this in a while. Anyone else want to
weigh in about an R implementation for multistate characters?
If you really have more than 10 states for a single character I'd be concerned
about whether multiple observers can consistently code those states anyway.
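To make the multistate coding concrete, here is a minimal sketch in Python that writes a single shape character as a NEXUS data block. The taxon names, the state codes, and the helper names `nexus_data_block` and `count_states` are all hypothetical, made up for illustration. I have not checked this block against any particular program, so treat the format line as an assumption to verify against the manual of whatever software you use.

```python
# Hypothetical taxa and tail-shape codes, for illustration only.
# "0", "1", "2" are discrete shape states; "?" means the shape was not observed.
TAIL_SHAPE = {
    "Taxon_A": "0",
    "Taxon_B": "1",
    "Taxon_C": "2",
    "Taxon_D": "2",
    "Taxon_E": "?",
}


def nexus_data_block(matrix):
    """Return a NEXUS DATA block for rows of single-symbol character codes."""
    lengths = {len(row) for row in matrix.values()}
    if len(lengths) != 1:
        raise ValueError("all taxa must have the same number of characters")
    nchar = lengths.pop()
    width = max(len(name) for name in matrix)
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={len(matrix)} nchar={nchar};",
        # Assumed format line; check it against your software's documentation.
        "  format datatype=standard missing=? gap=-;",
        "  matrix",
    ]
    lines += [f"    {name.ljust(width)}  {row}" for name, row in matrix.items()]
    lines += ["  ;", "end;"]
    return "\n".join(lines)


def count_states(matrix, index):
    """Count the distinct observed states in one character column."""
    observed = {row[index] for row in matrix.values()} - {"?", "-"}
    return len(observed)


print(nexus_data_block(TAIL_SHAPE))
print("observed states:", count_states(TAIL_SHAPE, 0))
```

Counting the observed states before an analysis is a cheap way to see whether a character is getting close to the state limit of the software.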
If instead you have truly inapplicable characters, for example crocodiles
without tails, and then ones with tails with various shapes, then I think the
best approach is to have two characters. A presence/absence character and a
You can also binarize every state as separate characters, but as you say it
introduces an implicit homology among qualitatively different absence states.
I've never liked this approach, but there are some articles defending it. I
think its validity may depend on a relatively even frequency distribution of
each state. Also, if you have other anatomical parts in the analysis, then you
are effectively weighting more heavily those parts that you split into more
binary characters - they will tend to drive the result. Some software allows you
to explicitly correct character weights, which would solve this, but not all
software provides that. Note - this binarizing by state is a common approach in
phylogeny of language cognates - but I believe that has come about because of a
tail-wagging-the-dog problem. BEAST, unless they changed it recently, didn't
support multistate character evolution, and the language phylo people use BEAST
There are a bunch of articles about this issue but I'd have to go dig them up.
I think I cited them in some of my earlier papers that did anatomical and
cultural phylogenetic work.
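The binarizing and weighting points above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the taxa and codes are hypothetical, `binarize` and `equal_part_weights` are made-up helper names, and giving each binary character a weight of 1/k (so the k characters from one anatomical part sum to 1) is just one simple correction, not something taken from a specific program.

```python
from fractions import Fraction

# Hypothetical multistate shape codes; "?" means not observed.
SHAPE = {
    "Taxon_A": "0",
    "Taxon_B": "1",
    "Taxon_C": "2",
    "Taxon_D": "2",
    "Taxon_E": "?",
}


def binarize(codes, missing="?"):
    """Split one multistate character into one 0/1 character per state."""
    states = sorted(set(codes.values()) - {missing})
    binary = {}
    for taxon, code in codes.items():
        if code == missing:
            # An unobserved state stays unobserved in every binary column.
            binary[taxon] = missing * len(states)
        else:
            binary[taxon] = "".join("1" if code == s else "0" for s in states)
    return states, binary


def equal_part_weights(n_binary):
    """Assumed correction: weight each binary character by 1/k."""
    return [Fraction(1, n_binary)] * n_binary


states, binary = binarize(SHAPE)
weights = equal_part_weights(len(states))

for taxon, row in binary.items():
    print(taxon, row)
print("weights:", [str(w) for w in weights], "sum:", sum(weights))
```

In the first binary column, Taxon_B and Taxon_C both score 0 even though their shapes differ. That shared 0 is the implicit homology among absence states. Without the weights, this one anatomical part would count as three characters instead of one.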
Date: Thu, 8 Feb 2018 17:27:54 +0100
From: Roland Sookias <r.sook...@gmail.com>
Subject: [R-sig-phylo] Not inferring homology within "absence" state
in phylogenetic analysis
Maybe someone has some insight here...
I am coming up against the problem, when it comes to phylogenetic analysis.
Basically I want to conduct a parsimony (or other phylogenetic) analysis where
"inapplicable" scores are treated as separate states *for each taxon* .
I.e. I want to hypothesize shared ancestry for taxa scored with one state
(let's say state 0), but not hypothesize shared ancestry for the other taxa.
However, I still want to penalize a change in state from and to state 0.
There are three approaches which I have thought about, but none seems to fit
-Score all taxa not showing state 0 as separate states. This should do what I
want, but the problem here is the limit on the number of states in most
-Scoring binary presence/absence. The problem is here that it could end up
being parsimonious to group the "absence" state together, when there is no
reason to infer homology within (i.e. for taxa scored with) this state.
-Score as 0 and inapplicable. The problem is this does not penalize a change
from 0 to inapplicable.
A real-life example is the shape of the ilium in crocodiles. I want to say
that it is likely that a particular curve in the dorsal margin of the ilium in
crocodile-line crocodilians is homologous, but I don't want to hypothesize
homology of those taxa "lacking" this state. They are all equally far from each
Thanks very much indeed