Re: [R-sig-eco] The final result of TWINSPAN

Dave Roberts Wed, 27 Apr 2011 10:47:37 -0700

Thanks Jari,

My original thought was to write a wrapper for the original FORTRAN code by replacing the file read of data with data passed from R, and then bringing in the results in a list. That would allow sizing the arrays at run-time and eliminating fixed array sizes. I have a copy of the FORTRAN code that Petr Smilauer modified for simplified input/output and that helped. Still, it ultimately appeared pretty messy (but still might be the best route), so I tried separating out the subroutines and calling them individually from R. From there I tried replacing some of the subroutines with native R to lower overhead. But in the end I just couldn't understand the code well enough to make it work.

So then I thought I should write a totally transparent version in native R, even if it doesn't replicate the original. On the down side people can say it's not correct; on the upside it's open source and people can evaluate it and modify it as they see fit. So, if there is interest I might post the code and examples on my web page and let somebody else run with it.


Dave

Jari Oksanen wrote:

On 27/04/11 00:40 AM, "Dave Roberts" <dvr...@ecology.msu.montana.edu> wrote:

     Earlier this year on an (undoubtedly ill-advised) lark I coded up
an R version of TWINSPAN.  It's far from a polished package at this
point, but the code does run.  One of the interesting features is that
you can partition a PCO or NMDS in addition to the traditional CA. To be
clear, I am not a TWINSPAN fan either, but I wanted it for a methods
paper I was working on.

     The problem is that I based the code on Hill, Bunch & Shaw (1975,
J of  Ecol  63:597-613) which is what I had available.  Apparently the
algorithm in the commercial TWINSPAN is significantly modified from the
original, but I couldn't find a description of the actual algorithm
anywhere in the literature.  It is probably described in the User Manual
of the software, but I was not sufficiently motivated to chase down a
copy.  I do have a copy of the FORTRAN code, but it was apparently
written in FORTRAN II, and is basically inscrutable, even to an old
FORTRAN dog like me.

     So, if somebody has a clear description of the actual algorithm
(and I think it is disturbing that I could not find one), it would be
possible to code it up in native R.  The alternative, to write a wrapper
for the original FORTRAN code, is not a trivial task.  I gave it a couple
of days and gave up.


Dave,

Hill, Bunch & Shaw describe the general idea of TWINSPAN, but the
implementation is more complicated. Martin Kent and Paddy Coker do a great
job of explaining the twists in their book ("Vegetation Description and
Analysis: A Practical Approach"). If I remember correctly, the TWINSPAN
manual also was more detailed, but I lost it somewhere when I moved around
(for the kids: it was a bunch of paper: pdf was not yet invented when
TWINSPAN was published).

I don't think that the actual TWINSPAN is easily extended beyond CA. Each
step is a two-stage one-dimensional ordination on a current subset, where
the first stage selects indicators and the second stage is polarized for the
indicator species. The final split is based on site ordination and
indicators are secondary (which we see in misclassifications if you try to
use the provided key for the data that was classified in TWINSPAN). The
polarization stage is particularly challenging when working with
dissimilarities (PCO, NMDS).
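
As a rough illustration of the one-dimensional ordination behind each
division (not the TWINSPAN code itself), the sketch below computes the first
CA axis of a small site-by-species table by reciprocal averaging and splits
the sites at the weighted centroid. It is written in Python rather than R or
FORTRAN, and the names first_ca_axis and split_sites are hypothetical. It
leaves out pseudospecies, indicator selection and the polarized second
stage, and it assumes every site and species has a positive total and that
the table has at least one non-trivial axis.

```python
import math


def first_ca_axis(table, iterations=200):
    """Site scores on the first CA axis, found by reciprocal averaging.

    table: list of rows (sites), each a list of non-negative abundances
    (one column per species). Assumes all row and column totals are > 0.
    """
    n_sites, n_species = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_sites)) for j in range(n_species)]
    grand = sum(row_tot)

    # Arbitrary non-constant starting scores for the sites.
    site = [float(i) for i in range(n_sites)]
    for _ in range(iterations):
        # Species scores: abundance-weighted averages of the site scores.
        species = [
            sum(table[i][j] * site[i] for i in range(n_sites)) / col_tot[j]
            for j in range(n_species)
        ]
        # Site scores: abundance-weighted averages of the species scores.
        site = [
            sum(table[i][j] * species[j] for j in range(n_species)) / row_tot[i]
            for i in range(n_sites)
        ]
        # Centre and rescale, weighting by site totals, which removes the
        # trivial constant solution and keeps the scores from shrinking.
        mean = sum(w * s for w, s in zip(row_tot, site)) / grand
        site = [s - mean for s in site]
        norm = math.sqrt(sum(w * s * s for w, s in zip(row_tot, site)) / grand)
        site = [s / norm for s in site]
    return site


def split_sites(table):
    """Dichotomy of sites: True for scores above the weighted centroid (0)."""
    return [score > 0 for score in first_ca_axis(table)]


# Two groups of sites that share few species.
table = [
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [0, 1, 5, 4],
    [0, 0, 4, 5],
]
groups = split_sites(table)
```

The sign of a CA axis is arbitrary, so which group comes out as True carries
no meaning; only the partition does.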

I don't think that the FORTRAN I have is completely impenetrable. I think
the largest problem is the design principle: R code should run silently and
return a result, but TWINSPAN prints as it goes and returns only a part
of the result. Incorporating that in R would mean stripping most PRINT and
WRITE statements and having the subroutines return useful data directly.
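
The "run silently and return a result" principle can be sketched as follows.
This is a Python illustration under simplifying assumptions, not a
translation of the FORTRAN: the function name divide, its arguments and the
dictionary layout are all hypothetical, and the split uses fixed scores cut
at their mean, whereas TWINSPAN re-ordinates each subset. The point is only
that every division is stored in the returned structure and nothing is
printed along the way.

```python
def divide(sites, scores, depth=0, max_depth=2, min_size=2):
    """Recursively split sites at the mean score and return the hierarchy.

    sites: list of site names; scores: dict mapping site name to a score.
    Nothing is printed; each node records its sites, depth, cut and children.
    """
    node = {"sites": list(sites), "depth": depth, "cut": None, "children": []}
    # Stop when the depth limit is reached or the group is too small to split.
    if depth >= max_depth or len(sites) < 2 * min_size:
        return node

    cut = sum(scores[s] for s in sites) / len(sites)
    left = [s for s in sites if scores[s] <= cut]
    right = [s for s in sites if scores[s] > cut]
    # Refuse splits that would leave a group below the minimum size.
    if len(left) < min_size or len(right) < min_size:
        return node

    node["cut"] = cut
    node["children"] = [
        divide(left, scores, depth + 1, max_depth, min_size),
        divide(right, scores, depth + 1, max_depth, min_size),
    ]
    return node


scores = {"a": 0.1, "b": 0.2, "c": 0.8, "d": 0.9}
tree = divide(list(scores), scores)
```

A caller can then inspect tree["children"] or write its own print method,
which keeps reporting separate from computation.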

I also wrote a small funny test on the TWINSPAN principle, where the splitting
and pre-defined pseudospecies were replaced with a regression tree split.
I'll send you a copy of that and the FORTRAN (IV, I think) code I have in a
separate message.
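
For readers unfamiliar with the term, a regression tree split picks the
threshold on a predictor that minimises the within-group sum of squares of a
response. The Python sketch below shows that generic idea only; it is not
the test mentioned above, and the function name best_split and the choice of
x (say, a species abundance) and y (say, a site ordination score) are
assumptions made for illustration.

```python
def best_split(x, y):
    """Return (cut, sse) for the threshold on x that minimises the summed
    within-group squared error of y, or None if all x values are tied."""

    def sse(values):
        # Sum of squared deviations from the group mean.
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    pairs = sorted(zip(x, y))
    best = None
    for k in range(1, len(pairs)):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # no threshold can separate tied x values
        cut = (pairs[k - 1][0] + pairs[k][0]) / 2
        total = sse([p[1] for p in pairs[:k]]) + sse([p[1] for p in pairs[k:]])
        if best is None or total < best[1]:
            best = (cut, total)
    return best


abundance = [0, 1, 2, 8, 9, 10]
site_score = [-1.0, -1.1, -0.9, 1.0, 1.1, 0.9]
result = best_split(abundance, site_score)
```

Unlike pre-defined pseudospecies cut levels, the threshold here is chosen
from the data.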

Cheers, Jari Oksanen

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

