Hello people

I am doing some thinking about how to analyse data on dimorphic animals - where different individuals of the same species have rather different morphology. An example of this is that some male beetles have large horns and small wings, and rely on beating the other guys up to get access to mates, whereas others have smaller horns and larger wings, and rely on mobility to find mates before the guys with the big horns turn up and beat them up. Normally what we do to see if this is happening is to plot a scatter plot of horn length vs body size. If it's a simple straight line relationship then no dimorphism, but if it's either a line with an obvious change of slope or two separate lines then you've got a dimorphic animal.

If you've just got a change of slope then it's relatively easy to test for significance by breakpoint regression, which will also tell you the body size at which the beetles switch from 'big wings, small horns' strategy to the other. What is more problematic, however, is if you have a dataset that is essentially two linear regressions that overlap in both X and Y. Here you can't do a breakpoint regression. The solution that I have come up with is to put a straight line between the two clouds of data and to use that line to define a two-level factor and fit a model with body size, the factor and the interaction to the horn size data. The line can then be moved up and down, and the slope varied, and the fit of the model for each intercept/slope combination compared. The best fit model can then be compared with a straight linear model with an F-test, which gives you a significance test, and the line allows you to decide which morph a particular beetle conforms to. I've written some R code to do this (see below). What i would like to know is a) if this problem has been dealt with before elsewhere, b) if there's a better way to do it and c) if my R function could be improved at all - it's my first attempt at such a thing, so any input would be really helpful. If anyone wants a sample data set then I can email you one.

Thanks for any input

Rob Knell

School of Biological Sciences
Queen Mary, University of London

'Phone +44 (0)20 7882 7720
http://www.qmw.ac.uk/~ugbt794
http://www.mopane.org

Here's the code:

switchline<- function(X,Y,Intmin,Intmax,Intgap,Slopemin,Slopemax,Slopegap)



{



#���X ������= unimodal variable�(e.g. filename$bodysize)



#���Y ������= bimodal variable�(e.g. filename$hornsize)



#����Intmin�������= Minimum switchline intercept



#����Slopemin�= Minimum switchline slope

#����Intmax�������= Maximum switchline intercept

#����Slopemax�= Maximum switchline slope

#����Intgap�������= Interval between each intercept

#����Slopegap�= Interval between each slope







#���This function written by Rob Knell 2003. It is an attempt to produce an

#����objective analysis for datasets of apparently dimorphic characters in which

#���there is overlap in both the X and Y variable. The routine divides the

#����individuals into one of two classes based on whether they are above or below

#���a line, and fits a linear model to the Y variable with the X variable as a

#����continuous explanatory variable and a single factor, "Morph", which is based

#���on whether they were above the line or below it, plus an interaction term.

#���The line is then adjusted, the individuals reclassified and the process

#����repeated. The output is of the form "Int" - intercept of the line dividing

#���one morph from another, "Slp" - the slope and "Rsqr", which gives the R-

#����squared value for the fitted model when the division into factors is based on

#���that particular line.



#���An error message probably means that one of your lines has missed the data

#����altogether - in other words, all of your data points are classified into a

#����single factor.

#







Lines.grid<- expand.grid(Intercept=seq(Intmin,Intmax,Intgap),Slope=seq(Slopemin,Slope max,Slopegap))



bandwidth<-length(Lines.grid$Intercept)





Lines<-matrix(nrow=bandwidth,ncol=3,data=NA)

dimnames(Lines)<-list(seq(1,bandwidth),c("Int","Slp","Rsqr"))



Lines[,1]<-Lines.grid$Intercept

Lines[,2]<-Lines.grid$Slope



for (i in 1:bandwidth)



����{

����temp.data<-matrix(nrow=length(X),ncol=4,data=NA)

����colnames(temp.data)<-c("X1","Y1","switchvalue","Morph")

����temp.data[,1]<-X

����temp.data[,2]<-Y

��������for(j in 1:length(X))

��������{

��������temp.data[j,3]<-(Lines[i,1]+Lines[i,2]*temp.data[j,1])

��������ifelse(temp.data[j,2]>=temp.data[j,3], temp.data[j,4]<-1, ��������temp.data[j,4]<-2)

��������}

����

����temp.data<-data.frame(temp.data)

����temp.data$Morph<-factor(temp.data$Morph)

����model<-lm(temp.data$Y1~temp.data$X1*temp.data$Morph)

����Lines[i,3]<-round(summary(model)$r.squared,6)

����}



print(Lines)



Lines<-data.frame(Lines)

Max<-max(Lines$Rsqr)

MaxI<-Lines$Int[Lines$Rsqr==Max]

MaxS<-Lines$Slp[Lines$Rsqr==Max]

Maxmat<-cbind(MaxI,MaxS)



print(Maxmat)

}

�����������������������

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Reply via email to