[R-sig-eco] Temporal clustering of factors

2013-08-15 Thread Chris McOwen
Dear list,

I have 50 sites where information was recorded over a 45-year period. Each
record takes one of four forms: Fishing effort, Environmental, Both, or
Inconclusive.

What I am aiming to do is cluster sites based on their similarity through
time. Essentially, I view this as similar to building a phylogeny, where
instead of a genetic sequence I have a sequence of factor levels.

I was thinking of using Gower distance to create a dissimilarity matrix and
going from there, but I don't think this captures what I am looking for.

Any suggestions would be gratefully received.


To save space, I have restricted the sample data to 4 sites:

temporal_sites <- structure(list(Year = c(1959L, 1960L, 1961L, 1962L, 1963L,
1964L,
1965L, 1966L, 1967L, 1968L, 1969L, 1970L, 1971L, 1972L, 1973L,
1974L, 1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L, 1982L,
1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L, 1991L,
1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L,
2001L, 2002L, 2003L, 2004L, 1959L, 1960L, 1961L, 1962L, 1963L,
1964L, 1965L, 1966L, 1967L, 1968L, 1969L, 1970L, 1971L, 1972L,
1973L, 1974L, 1975L, 1976L, 1977L, 1978L, 1979L, 1980L, 1981L,
1982L, 1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L, 1990L,
1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L,
2000L, 2001L, 2002L, 2003L, 2004L, 1959L, 1960L, 1961L, 1962L,
1963L, 1964L, 1965L, 1966L, 1967L, 1968L, 1969L, 1970L, 1971L,
1972L, 1973L, 1974L, 1975L, 1976L, 1977L, 1978L, 1979L, 1980L,
1981L, 1982L, 1983L, 1984L, 1985L, 1986L, 1987L, 1988L, 1989L,
1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L,
1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 1959L, 1960L, 1961L,
1962L, 1963L, 1964L, 1965L, 1966L, 1967L, 1968L, 1969L, 1970L,
1971L, 1972L, 1973L, 1974L, 1975L, 1976L, 1977L, 1978L, 1979L,
1980L, 1981L, 1982L, 1983L, 1984L, 1985L, 1986L, 1987L, 1988L,
1989L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L,
1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L), Site = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("A", "B", "C", "D"), class = "factor"),
Factor = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Both",
"Environmental", "Fishing Effort"), class = "factor")), .Names = c("Year",
"Site", "Factor"), class = "data.frame", row.names = c(NA, -184L
))





___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Temporal clustering of factors

2013-08-15 Thread Sarah Goslee
Hi Chris,

> I have 50 sites where information was recorded over a 45 year time period.
> The recorded data could take one of four forms: Fishing effort,
> Environmental, Both or Inconclusive.
>
> What I am aiming to do is cluster sites based on their similarity through
> time, essentially I view this as being similar to making a phylogeny, where
> instead of a genetic sequence I have a sequence of factors.
>
> I was thinking of using Gower distance to create a dissimilarity matrix and
> go from there but I don't think this captures what I am looking for?

Why not? What are you looking for? Have you looked into change vector
analysis or other multivariate time series methods?
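
To make the Gower question concrete, here is a minimal R sketch, not a
recommendation. It assumes one categorical record per site per year and no
missing values. Under those assumptions the Gower dissimilarity between two
sites reduces to the proportion of years in which their categories differ,
so shuffling the years identically at every site would leave the distances
unchanged. The `seqs` list is toy data loosely modelled on the four posted
sites, and average linkage is an arbitrary choice.

```r
# Toy sequences: one category per year for 1959-2004 (46 years per site).
# Run lengths are illustrative, loosely modelled on the posted sample.
years <- 1959:2004
seqs <- list(
  A = rep(c("Environmental", "Fishing Effort", "Both", "Fishing Effort"),
          times = c(12, 8, 10, 16)),
  B = rep(c("Environmental", "Fishing Effort"), times = c(37, 9)),
  C = rep(c("Environmental", "Fishing Effort"), times = c(10, 36)),
  D = rep(c("Fishing Effort", "Environmental", "Both", "Fishing Effort"),
          times = c(6, 15, 9, 16))
)
stopifnot(all(lengths(seqs) == length(years)))

# Sites in rows, years in columns.
wide <- do.call(rbind, seqs)
colnames(wide) <- years

# Proportion of mismatching years for every pair of sites.
n <- nrow(wide)
mismatch <- matrix(0, n, n, dimnames = list(rownames(wide), rownames(wide)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    mismatch[i, j] <- mean(wide[i, ] != wide[j, ])
  }
}

# Hierarchical clustering on the mismatch dissimilarities.
d  <- as.dist(mismatch)
hc <- hclust(d, method = "average")
plot(hc, main = "Sites clustered by year-by-year mismatch",
     xlab = "", sub = "")
```

If the timing or ordering of transitions matters more than year-by-year
agreement, that is something this kind of distance will not pick up.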

Though if this is representative, you don't have much information,
only four categories of a single factor: why not just plot them?
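
A plot along these lines could be sketched in base R as follows. The `seqs`
list is toy data loosely modelled on the four posted sites, and the colours
and legend placement are arbitrary assumptions, so treat it as a starting
point only.

```r
# Toy sequences: one category per year for 1959-2004 (46 years per site).
years <- 1959:2004
seqs <- list(
  A = rep(c("Environmental", "Fishing Effort", "Both", "Fishing Effort"),
          times = c(12, 8, 10, 16)),
  B = rep(c("Environmental", "Fishing Effort"), times = c(37, 9)),
  C = rep(c("Environmental", "Fishing Effort"), times = c(10, 36)),
  D = rep(c("Fishing Effort", "Environmental", "Both", "Fishing Effort"),
          times = c(6, 15, 9, 16))
)

# Encode categories as integers 1..3; result is a years x sites matrix.
lev   <- c("Both", "Environmental", "Fishing Effort")
cols  <- c("purple", "forestgreen", "steelblue")
codes <- sapply(seqs, function(s) as.integer(factor(s, levels = lev)))

# One horizontal strip per site, coloured by category through time.
par(mar = c(4, 4, 4, 1))
image(x = years, y = seq_along(seqs), z = codes,
      col = cols, breaks = c(0.5, 1.5, 2.5, 3.5),
      axes = FALSE, xlab = "Year", ylab = "Site")
axis(1)
axis(2, at = seq_along(seqs), labels = names(seqs), las = 1)
box()
legend("top", legend = lev, fill = cols, horiz = TRUE,
       inset = c(0, -0.2), xpd = TRUE, bty = "n")
```

With 50 sites the same call gives 50 strips, which may still be readable
if the sites are ordered sensibly.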

Please don't cross-post or post multiple times. If you don't get an
instantaneous response, then the volunteers who answer questions might
be busy, or your question might be poorly formed or vague. Give the
list some time, at least a day, to get caught up.

Sarah



-- 
Sarah Goslee
http://www.functionaldiversity.org
