Re: [R-sig-eco] twinspan classification rules as narrative

2019-12-17 Thread Gonzalez-Mirelis, Genoveva
Hi Jari and all, 

Thank you for your explanation, and indeed thank you for developing the R 
package, which I did download off your GitHub repo.

It seems like I picked a fairly easy example, so let me make sure that I have 
understood the rules correctly by describing here the previous division, namely 
division 4 (+Cladrang4 +Cladmero1 -Sterpasc5 -Empenigr3 +Callvulg2 < 2).

First of all, here are the species and abundance thresholds that need to be 
checked at this division (I refer to these as "species conditions" later): 
Cladrang abundance > 5 (if so, +1), Cladmero abundance > 0 (if so, +1), 
Sterpasc abundance > 25 (if so, -1), Empenigr abundance > 1 (if so, -1) and 
Callvulg abundance > 0.1 (if so, +1). I'll just remind you that I have used the 
cutlevels in the Braun-Blanquet scale.

See:

ahti[rownames(ahti) %in% c("Ster113", "Ster097", "Ster098"),
     colnames(ahti) %in% c("Cladrang", "Cladmero", "Sterpasc", "Empenigr",
                           "Callvulg")]

Now (as opposed to the example in my previous post) there are multiple ways in 
which the total score may be < 2. For example, site Ster113 (one of the three 
members of the final group) has Sterpasc present at an abundance > 25 and 
Empenigr at an abundance > 1; since the other three species conditions are not 
met, the total score is -2, which is < 2. Similarly, site Ster097 meets the 
conditions for the two negative indicators, as well as the condition for 
Cladmero, which is present at an abundance > 0, giving a score of -1. And so 
on. Correct? 
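
As a rough check of the counting, the combinations can be enumerated in base R. 
This is only a sketch: it treats each of the five species conditions as simply 
met (1) or not met (0), takes the signs from the division 4 rule, and does not 
call the twinspan package. The object names (wt, met, score, neg_met) are made 
up for illustration.

```r
# Signs of the five indicator pseudospecies in the division 4 rule
wt <- c(Cladrang4 = 1, Cladmero1 = 1, Sterpasc5 = -1, Empenigr3 = -1,
        Callvulg2 = 1)

# All 2^5 combinations of "condition met" (1) / "not met" (0)
met <- as.matrix(expand.grid(rep(list(c(0, 1)), length(wt))))
colnames(met) <- names(wt)

# Indicator score of each combination: sum of the signs of the met conditions
score <- drop(met %*% wt)

# Number of negative indicator conditions met in each combination
neg_met <- met[, "Sterpasc5"] + met[, "Empenigr3"]

sum(score < 2)                 # 26 of the 32 combinations satisfy "< 2"
sum(score < 2 & neg_met == 0)  # 4 of those have no negative indicator at all
```

So, under these assumptions, a score below 2 does not by itself tell which of 
the conditions were met: it can be reached with or without the negative 
indicators.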

Then, would it also be fair to say that, if the total score threshold is "much" 
smaller than the total number of indicators (in this example the former is 2 
and the latter is 5) it is more "important" that the conditions for the 
negative indicators are met?

Keep in mind that what I'm ultimately trying to do is to provide a summarized 
description of species composition for all sites in a given group (and that I 
don't want this description to be much longer than it needs to be!).

Thank you very much again!

G



Re: [R-sig-eco] twinspan classification rules as narrative

2019-12-16 Thread Jari Oksanen
Howdy,

TWINSPAN is not on CRAN. It seems that you found it on GitHub. 

TWINSPAN is an old method, and it seems that people are forgetting how it 
works. Here is some narrative:

First, you have defined cut levels to transform your abundance data into binary 
indicator “pseudospecies”. You give these cut levels in your call. Each species 
is split by these cut levels into pseudospecies, and the cut level number is 
appended to the species name. In your example, the indicator pseudospecies at 
division 8 are actually Cladgray1 and Cladnigr1, where the added ‘1’ just means 
that the species occurs but can have any abundance value: there is no way 
of knowing its abundance except for the lower limit (>0). In division 1 you 
have, for instance, Cladmiti4, which means that the species occurs at least at 
cut level 4: at a quantity of at least 5, but it can have any value above that 
limit.
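
A minimal sketch of that transformation in base R, assuming that the cut level 
0 means plain presence (> 0) and that every higher level is reached when the 
abundance is at least that cut value. The helper pseudolevel() and the 
abundances are made up for illustration; they are not part of the twinspan 
package, which may treat boundary values differently.

```r
# Cut levels as given in the twinspan() call of the original post
cutlevels <- c(0, 0.1, 1, 5, 25, 50, 75)

# Highest pseudospecies level reached by each abundance (0 = absent).
# pseudolevel() is a hypothetical helper, not a twinspan function.
pseudolevel <- function(x, cuts = cutlevels) {
  lev <- findInterval(x, cuts)  # number of cut values that are <= x
  lev[x <= 0] <- 0L             # absent species reach no level
  lev
}

abund <- c(0, 0.05, 0.1, 3, 5, 30)  # made-up abundances
pseudolevel(abund)
# 0 1 2 3 4 5
```

A species at level 4 here would also count as present for the pseudospecies at 
levels 1 to 3, which is why a rule only gives a lower limit.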

Now to the narrative for the division. The rule for division 8 (that you 
mention in your post) is actually “+Cladnigr1 +Cladgray1 < 1”. Both are at the 
lowest cut level 1 (present with any abundance), and the ‘+’ sign means that 
both are positive indicators: you add +1 for every plot where they occur. Were 
the sign ‘-’, you would add -1 for each presence to give negative scores. 
Doing this for all indicator species gives you the indicator score: if both 
species are present, your score is 2, if one is present, your score is 1, and 
if neither is present, your score is 0. The condition is ‘< 1’, meaning that 
if neither is present (score 0), the condition is true and you go to final 
group 16, but if one or both are present (score 1 or 2), the condition is 
false and you continue to division 17. However, this is a tree, and this 
narrative rule only applies to division 8 and the 16 sampling units it 
contains: these are split by this rule. To get to this division with this rule, 
you must have satisfied the previous rules leading to this branch. You may see 
the branch structure using plot(twb): the internal divisions are shown in 
squares on the tree, and the final groups and their sizes as terminal leaves.
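
The scoring for this rule can be sketched in base R as follows. The three plots 
and their pseudospecies levels are made up, and the object names (lev, wt, 
minlev, met, score) are only for illustration; nothing here calls the twinspan 
package.

```r
# Hypothetical pseudospecies levels of the two indicators in three made-up plots
lev <- rbind(plotA = c(Cladnigr = 0, Cladgray = 0),
             plotB = c(Cladnigr = 1, Cladgray = 0),
             plotC = c(Cladnigr = 2, Cladgray = 3))

# Rule "+Cladnigr1 +Cladgray1 < 1": sign and minimum level of each indicator
wt     <- c(Cladnigr = 1, Cladgray = 1)
minlev <- c(Cladnigr = 1, Cladgray = 1)

# Species in rows, so that minlev and wt are recycled species by species
met   <- t(lev) >= minlev   # does the plot have the pseudospecies?
score <- colSums(met * wt)  # indicator score of each plot
score
# plotA plotB plotC
#     0     1     2

score < 1  # TRUE: final group 16, FALSE: continue to division 17
# plotA plotB plotC
#  TRUE FALSE FALSE
```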

The classification rules give you only the lower limit for each species, and 
depending on the indicator score threshold, even some of these indicators may 
be missing in a plot. However, you can use function twintable to see the actual 
cut levels for each species. These serve as cover-class values, but do not 
give any more detail than the cut levels you defined.

Cheers, Jari


___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] twinspan classification rules as narrative

2019-12-16 Thread Gonzalez-Mirelis, Genoveva
Dear all,

I am trying to understand the results from the twinspan function in the R 
package that has been recently developed (also named twinspan).

Particularly, I would like to be able to derive the classification rules 
(indicator species and abundance values, or rather ranges) for each terminal 
group of the twinspan classification.

From this:

library(twinspan)
data(ahti)
twb <- twinspan(ahti, cutlevels = c(0, 0.1, 1, 5, 25, 50, 75))
summary(twb)

I understand that, say, for group number 16 (the first terminal group 
encountered) the indicator species were Cladnigr and Cladgray.

I also understand that the indicator score threshold tells me which path to 
follow down the tree (left or right). But I struggle to understand just what 
the indicator score (1) means, and whether it can be related to the original 
abundance values of those two species at the three relevant sites, namely 
Ster113, Ster097 and Ster098.

What would be a narrative way to describe this particular branch of the tree?

Many thanks in advance,

Genoveva

Genoveva Gonzalez Mirelis, Scientist
Institute of Marine Research
Nordnesgaten 50
5005 Bergen, Norway
Phone number +47 55238510

