My apologies for cross-posting.

Hi all,

I am interested in performing a cluster analysis on ecological data from
forests in Pennsylvania.  I would like to develop definitions for forest
types (red maple forests, upland oak forests, etc. (AH, AR in attached table))
based on measured attributes in each forest type.  To do this, I would like
to 'draw clusters' around forest types based on information from the various
tree species (red maple, red oak, etc. (837, 832 in attached table))
occurring in those forests.  Each row of data contains mean values for a
particular species occurring within a forest type at a particular site.  In
other words, if we monitored 10 sites in red maple forests, we would have
only 10 rows of data for the tree species 'red maple', even though we
measured 100 trees.

I have used classification trees to examine these data, which I like because
of their predictive ability for later 'unknown' datasets.  However, my
concern is that the mean species attributes (columns Diameter:Avgnumtrees in
attached table) are tied to (nested within?) the tree species (column
Treespecies in attached table): they are not independent attributes, but
describe only the species listed in that row.
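To make the setup concrete, here is a minimal sketch of the kind of classification tree described above. It runs on simulated data, not the real data_hal_dom. The use of the rpart package, the choice of predictors, and all values are assumptions made for illustration. Only the column names are taken from the table header further down.

```r
library(rpart)

# Simulated stand-in for the real data: one row per species within a
# forest type at a site. All values are made up for illustration.
set.seed(1)
n <- 40
sim <- data.frame(
  ForestType      = factor(rep(c("AH", "AR"), each = n / 2)),
  COMMON_NAME     = factor(sample(c("redmaple", "whiteoak", "blackoak"),
                                  n, replace = TRUE)),
  AverageDiameter = c(rnorm(n / 2, 13, 1), rnorm(n / 2, 17, 1)),
  AVGHT           = c(rnorm(n / 2, 70, 5), rnorm(n / 2, 95, 5))
)

# Classification tree: forest type as the response, species and the
# species-level means as predictors. The means only make sense together
# with the species in the same row, which is the nesting concern.
fit <- rpart(ForestType ~ COMMON_NAME + AverageDiameter + AVGHT,
             data = sim, method = "class")

# Predict the forest type for new, 'unknown' rows.
new_rows <- data.frame(
  COMMON_NAME     = factor(c("redmaple", "whiteoak"),
                           levels = levels(sim$COMMON_NAME)),
  AverageDiameter = c(12.5, 17.5),
  AVGHT           = c(68, 97)
)
pred <- predict(fit, newdata = new_rows, type = "class")
print(pred)
```

Note that the tree treats each row as an independent observation, so nothing in this sketch accounts for rows from the same site or forest type belonging together.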

My question is: what is the best way to conduct a cluster analysis (I have
also tried hclust, cclust and flexclust) or fit a CART model with this sort
of nested data?
Also, what is the preferable method for predicting a new dataset once these
clusters or CART models have been developed?
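To illustrate the prediction part of the question: hclust() has no predict() method, so new observations have to be assigned to clusters some other way. The sketch below uses simulated data and assigns new rows to the nearest cluster centroid. The nearest-centroid rule, the Ward linkage, the choice of k = 2 and all values are assumptions of the sketch, not a recommendation for these data.

```r
# Simulated species-level means for two well-separated groups
# (values are made up; column names follow the table header below).
set.seed(2)
x <- rbind(
  cbind(AverageDiameter = rnorm(10, 13, 0.5), AVGHT = rnorm(10, 70, 3)),
  cbind(AverageDiameter = rnorm(10, 17, 0.5), AVGHT = rnorm(10, 95, 3))
)

# Standardize, cluster hierarchically, and cut the tree into k = 2 groups.
xs <- scale(x)
hc <- hclust(dist(xs), method = "ward.D2")
cl <- cutree(hc, k = 2)

# Cluster centroids on the standardized scale (rows = clusters).
centroids <- apply(xs, 2, function(col) tapply(col, cl, mean))

# New observations, standardized with the training centers and scales.
new_x <- cbind(AverageDiameter = c(13.2, 16.8), AVGHT = c(72, 93))
new_s <- scale(new_x,
               center = attr(xs, "scaled:center"),
               scale  = attr(xs, "scaled:scale"))

# Assign each new row to the cluster with the nearest centroid
# (squared Euclidean distance).
nearest <- function(row) which.min(colSums((t(centroids) - row)^2))
new_cl <- apply(new_s, 1, nearest)
print(new_cl)
```

As with the tree, this treats rows as independent and ignores the species nesting, so it shows the mechanics only.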

Any help would be greatly appreciated.

Kind regards,
Scott

> head(data_hal_dom, 15)
ForestType  COMMON_NAME  BasalArea  TreesperAcre  DeadperAcre  VolumeperAcre  BiomassperAcre  AverageDiameter  STDERRDIAM  AVGHT  STDERRHT  AVGNUMTREES
AH  blackoak        50  31.5  25.1  NA   950.9  47955    15.1  1.1   86.8  15.2  4
AH  chestnutoak     50  11.2  12    NA   231.9  16713.8  13.1  0.3   55     4.2  2
AH  northern oak    50  45.3  37.6  NA  1319.7  82508.2  14.7  0.9   81.5   7    6
AH  redmaple        50  51.9  66.2  NA  1564.4  60960.9  12    0.2   70.3   2.5  3
AH  redpine         50   8.8   9.3  NA   189.4   8106.9  13.2  0     42     0    1
AH  scarletoak      50  41.2  27.9  NA  1211    67645.6  16.3  1.5   80.3  12.4  3
AH  whiteoak        50  10.4   9.2  NA   264.1  15738.6  14.4  0.3   73.3   0    1.3
AR  northern oak    50  47.2  30.1  12  1506.4  93490    16.9  0.9   84.2  10.7  5
AR  paperbirch      50   7.5   6    NA   243.7   9637    15.1  0     77     0    1
AR  redmaple        50   7.1   6    6    226.7   9102.2  14.6  0     75     0    1
AR  sweetbirch      50   4.7   6    NA   146.3   6676.2  12    0     75.5   0    1
AR  whiteash        50   6.8   6    NA   261.5   9474.5  14.4  0    106     0    1
AR  yellow-poplar   50  23.8  18.1  NA   962.1  28302.8  15.3  2.1   99.3   6.8  3
AR  easternhemlock  70  16.6   6    NA   512.6  17125.8  22.5  0     94     0    1
AR  northern oak    70  16.2   6    12   583.4  38060.4  22.2  0    110     0    1

Scott Bearer
Forest Ecologist
The Nature Conservancy
 in Pennsylvania
Community Arts Center
220 West Fourth Street, 3rd Floor
Williamsport, PA  17701
