Re: [R-sig-eco] [EXTERNAL] Time to Independence in R

2021-03-03 Thread Philippi, Tom
Manuel--
I apologize in advance for not answering the exact question you ask about 
packages.  [It is included in some geostatistics packages in terms of 
semivariance, nugget, sill, etc.]

In ecological data, time to independence is very scale dependent.  There's 
autocorrelation at scales of seconds due to instrument temperature-dependence 
if that hasn't been calibrated for, or the same individuals in the camera trap 
frame.  That component of dependence may have a half-life of minutes.  There's 
often autocorrelation based on time of day & temperature, with cycles of 24 
hours.  There may be pulse events from storms that persist a few days.  There's 
seasonality driving temperatures, day lengths, and plant & animal behavior, 
with cycles of 1 year.  Then where I live there are ENSO-driven temporal 
dependence at scales of 1.5 - 3 years, PDO at about a decade, and ENSO-La Nina 
dominated periods of 4-6 decades that drive not just ocean ecology, but 
rainfall & thus terrestrial ecology.  Then there's tends up to climate change.

So, in my experience in optimizing sampling designs for monitoring for trends, 
the majority of the temporal dependence is driven by cycles or pulses of 
characteristic duration, and that is more useful for determining the sampling 
frequency than empirical estimation form a "continuous" datastream of limited 
duration.  That approach also helps me think about the spatial concordance of 
the correlated errors: which are site-specific, which are concordant across all 
of the sites.

Tom 

-Original Message-
From: R-sig-ecology  On Behalf Of Manuel 
Spínola
Sent: Wednesday, March 3, 2021 1:06 PM
To: r-sig-ecology@r-project.org
Subject: [EXTERNAL] [R-sig-eco] Time to Independence in R



 This email has been received from outside of DOI - Use caution before clicking 
on links, opening attachments, or responding.



Dear list members,

It is common in ecology to sampling in almost a continuous manner when using 
data loggers, camera traps, sound recorders, gps radio-collars. etc.

Is there any R package to assess time to independence for the data to avoid 
temporal autocorrelation?

I know that there are models to take into account the temporal autocorrelation 
of the data, but I am asking to optimize the data collection, before modeling.

Thank you very much in advance.

Manuel

--
*Manuel Spínola, Ph.D.*
Instituto Internacional en Conservación y Manejo de Vida Silvestre Universidad 
Nacional Apartado 1350-3000 Heredia COSTA RICA mspin...@una.cr 
 mspinol...@gmail.com
Teléfono: (506) 8706 - 4662
Personal website: Lobito de río 

Institutional website: ICOMVIS 


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://gcc02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fstat.ethz.ch%2Fmailman%2Flistinfo%2Fr-sig-ecologydata=04%7C01%7Ctom_philippi%40nps.gov%7C7c92bae61c0d458d045908d8de8881b0%7C0693b5ba4b184d7b9341f32f400a5494%7C0%7C0%7C637504025171600039%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000sdata=kPnj2hOk7NHbscFpSMuSkQvqrVBymYw2Eb7XeSrmDp0%3Dreserved=0

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Ecological datasets for teaching statistics

2020-06-18 Thread Philippi, Tom
Many ecological journals either encourage or require that the dataset behind a 
paper be submitted to a repository such as dryad, or included as an electronic 
appendix.  Even if your university does not have institutional subscription to 
all journals, some journals like Ecosphere are open-access, many journals allow 
authors to pay to make their papers open-access, and some journals with paywall 
papers allow free access to the abstract and the supplements including datasets 
or links to the datasets in repositories.  That has an advantage of letting you 
work from a topic or form of data for your teaching to find suitable datasets.  
Also, those datasets tend to be cleaned and documented and close to ready for 
the analyses, as they were used in the analyses in the publications.

The rdryad package from rOpenSci (on CRAN) has the ability to search dryad, but 
I suspect that search works better with ecological keywords than statistical 
ones.

Tom

-Original Message-
From: R-sig-ecology  On Behalf Of Rich 
Shepard
Sent: Thursday, June 18, 2020 10:38 AM
To: r-sig-ecology@r-project.org
Subject: [EXTERNAL] Re: [R-sig-eco] Ecological datasets for teaching statistics

On Thu, 18 Jun 2020, Manuel Spínola wrote:

> I teach statistics to students in ecology and environmental sciences 
> fields and I would like to know if you could point me in the right 
> direction of sources of ecological/environmental datasets within and 
> outside packages, especially for general/generalized linear models and 
> multivariate statistics.

Manuel,

Ecology, and it's applied focus Environmental science, are very broad. I've 
been working with these data for several decades so I need to ask what types of 
data you want.

I don't know what's available from Costa Rican agencies but I do know that in 
the US you can get geochemical, biological, hydrologidal, and other data from 
the US Geological Survay, Environmental Protection Agency (if they've not 
removed them), Department of Agriculture's Forest Service and Natural Resources 
Conservation Service.

You can also look at StreamNet run by the Pacific States Marine Fisheries 
Council, The Army Corps of Engineers for hydraulic, flow, and sediment 
transport data.

That's a start.

Rich

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] [EXTERNAL] Re: trasformation of matrix for iNEXT extrapolation interpolation of Hill's numbers

2019-06-20 Thread Philippi, Tom
Or, if the component matrices need to be numeric with species names as row
names and sites as column names, Dave Roberts' labdsv package has matrify()
that does that and fills in the unseen combinations with 0s as opposed to
NAs.  Converting {site, taxon, abundance} triplets to matrix form is a very
common operation for vegan, labdsv, and other vegetation packages.  Thus,
you could build the list of 4 matrices with something like:

ciliates <- list(
 M1 = labdsv::matrify(d1),
 M2 = labdsv::matrify(d2),
 M3 = labdsv::matrify(d3),
 M4 = labdsv::matrify(d4)
)

Tom



On Thu, Jun 20, 2019 at 11:35 AM Law, Jason 
wrote:

> I'm not sure what your data looks like. But the reshape package can be
> used to easily create matrix or array data from a 'long' format. Combining
> that with split would make it relatively to create matrices.
>
> library(iNEXT)
> library(reshape2)
> data(ciliates)
> # Maybe your data is in long format like this:
> df <- melt(ciliates, varnames = c('species', 'sample'), value.name =
> 'count')
> # Then this will work
> df.list <- lapply(split(df, df$L1), function(x) acast(x, species ~ sample,
> value.var='count'))
>
> # If your data is just one wide (sorted) matrix, then you just need lapply
> and the column indices
> mat <- acast(df, species ~ L1 + sample, value.var='count')
> df.list <- lapply(list(c(1,17), c(18,36), c(37,51)), function(i)
> mat[,i[1]:i[2]])
>
> Regards,
>
> Jason
>
> -Original Message-
> From: R-sig-ecology  On Behalf Of
> Irene Adamo
> Sent: Thursday, June 20, 2019 07:28
> To: r-sig-ecology@r-project.org
> Subject: [R-sig-eco] trasformation of matrix for iNEXT extrapolation
> interpolation of Hill's numbers
>
> Hi all,
>
> I am working with iNEXT package for diversity analyses however, I would
> like to use a incidence_raw data type which in iNEXT is a list of matrices
> like this:
>
> data(ciliates)
> head(ciliates)
>
> $SouthernNamibDesert
>  x9 x17 x19 x20 x21 x22
> x23 x24
> Acaryophrya.collaris  0   0   0   0   0   0
>   0   0
> Actinobolina.multinucleata.n..sp. 0   0   0   0   0   0
>   0   0
> Afroamphisiella.multinucleata.n..sp.  0   0   0   0   0   0
>   0   0
> Afrothrix.multinucleata.n..sp.0   1   0   0   0   0
>   0   0
> Amphisiella.binucleata.multicirrata.n..ssp.   0   0   0   0   0   0
>   0   0
> Amphisiella.elegans.n..sp.0   0   0   0   0   0
>   1   0
> Amphisiella.longiseries.n..sp.0   0   0   0   0   0
>   0   0
> Amphisiella.magnigranulosa0   0   0   0   0   0
>   0   1
> Amphisiella.multinucleata.n..sp.  0   0   0   0   0   0
>   0   0
> Amphisiella.namibiensis.n..sp.0   0   0   0   0   0
>   0   0
> Amphisiella.polycirrata   0   0   0   0   0   0
>   0   0
> Amphisiella.procera.n..sp.1   0   0   0   0   0
>   0   0
>
> do you know how can I create anlist with four matrices like the one they
> have in example data?
>
> thanks a lot for any help!
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] [EXTERNAL] Producing images of Plant leaves in R

2019-06-11 Thread Philippi, Tom
Eric--
If you don't care about "leaf-like" shape, you have 6 X,Y points defined
plus a pair of points at the maximum width at some undefined distance from
the apex.  Assuming bilateral symmetry, and assuming that the maximum width
is at 50% from apex to base, you have 5 points {0,0, b10width/2,length*.1,
maxW/2,length/2,  b90width/2,length*.9, and 0,length}, so you could
trivially fit a 4th order polynomial to those points, plot it and the
mirror image, and have blobs with those specific measurements.

I suspect that those blobs will not really look like your leaves.  Looking
at  https://en.wikipedia.org/wiki/Glossary_of_leaf_morphology , maybe
acute, obtuse, perforate, & oblanceolate might have max width at 50% of
length.  You _might_ be able to guess a position along the base-apex axis
for the maximum width (e.g., 35%) and get more reasonable shapes.  But I
suspect you need more measurements to characterize most leaf shapes.

Alternatively, if you only have one shape of leaves, you might be able to
take that shape, then scale & warp it to your truss of points.

Tom

On Tue, Jun 11, 2019 at 1:09 PM Eric Doucette 
wrote:

> Hello, does anyone know of a package(s) or methods to reproduce the shapes
> of plant leaves in R? I have a large dataset (over 7500) of individual
> plant leaves for which I have the following measurements: leaf length, leaf
> width, leaf width at 1/10 of the overall length below the apex, and leaf
> width at 1/10 of the overall length above the base. I am looking for a way
> to code to get a visual representation of these lengths. They do not have
> to actually be the same as the "real" measurements as long as the ratio of
> measurements is correct. Thanks for any advice you can provide,
> Eric
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] [EXTERNAL] Re: Query

2019-05-30 Thread Philippi, Tom
Bruce--
You don't specify if you want the intersected polygons: 1 record for each
polygon intersection of species rangemap and habitat polygon, or just a
list of which habitat type polygons are at least partially within the
species' rangemap polygons.
The former is much slower but can give you estimates of proportions of
habitat (not necessarily _good_ estimates!).

In general, brute force polygon-polygon intersections are slow, as the
computation involves each segment of one polygon compared against each
segment of the other polygon.  Depending on your question, you may be able
to use gOverlaps() instead of the slower gIntersection().  One of the nice
things about coding this in R is that you can use bounding boxes to rapidly
eliminate pairs of polygons to test, and points in polygons as rapid
flagging of intersected/overlapped polygons if you only need the list of
habitats.  Then, the slower over() or gInstersect() computations only need
to be performed on the few pairs that overlap bounding boxes but not point
in polygons.  For the Rodhouse et al. 2016 Ecosphere Bats in Parks paper,
half a day of thinking, coding, & testing resulted in ~10k increase in
speed.  If your habitat polygons tile the entire area, I suspect first
using bounding boxes of your rangemaps to select only relevant habitat
polygons, then using point in polygon of habitat polygon centroids vs
rangemap polygons (assuming habitat polygons are much smaller than
rangemaps) will also greatly speed your computations.

Tom


On Thu, May 30, 2019 at 5:57 AM Sarah Goslee  wrote:

> Hi Bruce,
>
> There's extensive material online about spatial capabilities in R.
> Without being sure what you've looked at so far, do these help:
>
>
> https://stackoverflow.com/questions/15075361/how-to-perform-a-vector-overlay-of-two-spatialpolygonsdataframe-objects
>
> https://rspatial.org/rosu/Chapter11.html
>
> In general, you need the sp and rgeos packages. I know that the sf
> package also has very good capabilities for overlaying data, but I'm
> less familiar with its usage.
>
> Sarah
>
> On Thu, May 30, 2019 at 8:36 AM Bruce Miller  wrote:
> >
> > Hi all,
> >
> > I am still searching for a way to extract associations between 2 vector
> > polygon layers using R.
> >
> > What I need is information on if this is possible and if so which
> > packages to use and a basic set of code to do what I can run in the GIS
> > software.
> >
> > For each iteration I have a vector polygon of a bat species
> distributions.
> > Depending on the country  the number of these can range from only a
> > dozen to > 150 species distribution shape files.
> >
> > What I need to do is use the most current habitat/ecosystem vector
> > layers for each geographic area and then extract the species overlapping
> > values.  This creates a new shape file that has added to the species
> > data table all values for the ecosystem where the species polygons
> > overlap the ecosystem polygons.  This provides habitat associations for
> > each species.
> >
> > Then the table is exported to Excel and the data is used for Interactive
> > Identification Keys for New World Bat calls.
> > This eliminates species that do not occur in a given ecosystem as people
> > work on IDing their bats form acoustic surveys.
> >
> > I assume this can be accomplished using R.
> >
> > I have a new ecosystem data set of polygons for Peru, for example that
> > is very detailed.  I extracted one species late yesterday.
> > It took >8 hours in the GIS to compile the areas where the species
> > distribution and ecosystem areas overlapped!
> > I would like to do this in R as a batch file and run on another computer
> > so not to tie up my main workstation.
> >
> > Any guidance or suggestions welcomed.
> >
> > Bruce
> >
> > ___
> > R-sig-ecology mailing list
> > R-sig-ecology@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>
>
> --
> Sarah Goslee (she/her)
> http://www.numberwright.com
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] Fwd: [EXTERNAL] Which R package for Second-Stage nMDS ?

2019-03-22 Thread Philippi, Tom
Pierre--
That's actually an easy question.  Look at the vegan package
https://cran.r-project.org/web/packages/vegan/index.html.  It has functions
for various dissimilarity & distance metrics (and can use distance matrices
from other packages such as ecodist), MDS and allied processes, anosim and
restricted mantel tests, and many many other approaches.

More broadly, the cran task view for environmetrics is a great place to
start for finding packages directed at ecological questions and analyses.
https://cran.r-project.org/web/views/Environmetrics.html

Tom


On Fri, Mar 22, 2019 at 3:43 AM Pierre THIRIET 
wrote:

> Dear useRs,
>
> I want to perform 2nd stage nMDS, as described in Clarke, K.R., et al
> (2006). Exploring interactions by second-stage community analyses.
> Journal of Experimental Marine Biology and Ecology 338, 179-192. See
> Abstract below
>
> Do you know a package in R for that ? Or would you have home-made
> scripts, at least a function for computing the distance matrix of
> pair-wise correlations among dissimilarity matrices ?
>
> Thank you,
>
> Pierre
>
>
> Abstract of Clarke et al 2006 :
>
> Many biological data sets, from field observations and manipulative
> experiments, involve crossed factor designs, analysed in a univariate
> context by higher-way analyses of variance which partition out ‘main’
> and ‘interaction’ effects. Indeed, tests for significance of
> interactions among factors, such as differing Before–After responses at
> Control and Impact sites, are the basis of the widely used BACI strategy
> for detecting impacts in the environment. There are difficulties,
> however, in generalising simple univariate definitions of interaction,
> from classic linear models, to the robust, non-parametric multivariate
> methods that are commonly required in handling assemblage data. The size
> of an interaction term, and even its existence at all, depends crucially
> on the measurement scale, so it is fundamentally a parametric construct.
> Despite this, certain forms of interaction can be examined using
> non-parametric methods, namely those evidenced by changing assemblage
> patterns over many time periods, for replicate sites from different
> experimental conditions (types of ‘Beyond BACI’ design) – or changing
> multivariate structure over space, at many observed times. *Second-stage
> MDS, which can be thought of as an MDS plot of the pairwise similarities
> between MDS plots (e.g. of assemblage time trajectories), can be used to
> illustrate such interactions, and they can be formally tested by
> second-stage ANOSIM permutation tests. Similarities between
> (first-stage) multivariate patterns are assessed by rank-based matrix
> correlations, preserving the fully non-parametric approach common in
> marine community studies. *The method is exemplified using time-series
> data on corals from Thailand, macrobenthos from Tees Bay, UK, and
> macroalgae from a complex recolonisation experiment carried out in the
> Ligurian Sea, Italy. The latter data set is also used to demonstrate how
> the analysis copes straightforwardly with certain repeated-measures
> designs.
>
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] How to extract values from contiguous raster cells that ate not touched by SpatialLines?

2017-08-18 Thread Philippi, Tom
Andre--
I'm not completely clear on what you are trying to do.  My best guess
interpretation of "my raster sometimes goes from one
to several contiguous cells in all directions"  is that you have a raster
that you are aggregating into blobs of contiguous cells based on values in
one layer.  You then want to extract (area) values by blob for each blob a
polyline intersects.
If that's close to what you want, there are several approaches that will
work, with the most efficient approach dependent on the size of your
rasters, whether you can use gpu computing for very large rasters, etc.
I would first create a new thematic (factor) raster layer of blob
membership, then aggregate or compute your desired per blob measures with
that thematic layer and your other layer(s).  Then, extract blob indices
(values from the thematic layer) for those intersected by your lines with
extract:
blob.hits1 <- extract(blobraster, Spline1,df=TRUE,...), # note using the
polyline Spline1 not the rasterized version
Then subset your blob measures by those indices.

Even if I'm completely wrong in my guess of what you are attempting, I
would wager that you should use extract with the unrasterized Spline1
object.

I hope that this helps, or at least helps you revise your question to get a
better response from R-sig_Geo.

Tom 2

On Fri, Aug 18, 2017 at 11:30 AM, Bede-Fazekas Ákos 
wrote:

> Hi Andre,
> I don't know the answer, but I think you should post your question to
> R-sig-Geo mailing list (r-sig-...@r-project.org), where several authors
> of the spatial R-packages and other GIS experts are members and will read
> your post.
> HTH,
> Ákos Bede-Fazekas
> Hungarian Academy of Sciences
>
>
> 2017.08.18. 17:09 keltezéssel, Andre Rovai írta:
>
>> Hi all,
>>
>> I've been trying to extract values from a single attribute raster (area,
>> in
>> m2) that overlaps with lines (that is, a .shp SpatialLines).
>>
>> The problem is that, along these lines, my raster sometimes goes from one
>> to several contiguous cells in all directions.  Using the
>> extract function only values from cells that are touched by the lines are
>> extracted.  Thus, when I add up the extracted values from all lines a
>> significant amount of area (m2) is lost due to cells that were not touched
>> by the line and therefore values were not extracted.
>>
>> I tried to work it around by:
>>
>> Step 1 - first aggregating my raster to a lower resolution (i.e.
>> increasing
>> the fact argument) and then
>> Step 2 - rasterizing the lines using this aggregated raster (created in
>> step 1) as a mold to make sure the rasterized lines would get thick enough
>> to cover the horizontal spread of cells in my original resolution raster.
>> Step 3 - Then I resample the rasterized lines (created in step 2) back to
>> the original resolution I started with.
>> Step 4 - Finally, extracted the values from the resampled rasterized lines
>> (created in step 3).
>>
>> However, it didn't quite work as now the total area (m2) varies according
>> to the fact="" value I use when first aggregating the raster (in step 1).
>>
>> I really appreciate if anyone has already dealt with a similar problem and
>> can help me out here.  Here are the codes I've been running to try to get
>> it to work:
>>
>>
>> # input raster file
>>
>> g.025 <- raster("ras.asc")
>>
>> g.1 <- aggregate(g.025, fact=2, fun=sum)
>>
>>
>>
>> # input SpatialLines
>>
>> Spline1 <- readOGR("/Users/x.shp")
>>
>> Spline2 <- readOGR("/Users/x.shp")
>>
>> Spline3 <- readOGR("/Users/x.shp")
>>
>>
>>
>> # rasterizing using low resolution raster (aggregated)
>>
>> c1 <- rasterize(Spline1, g.1, field=Spline1$type, fun=sum)
>>
>> c2 <- rasterize(Spline2, g.1, field=Spline2$type, fun=sum)
>>
>> c3 <- rasterize(Spline3, g.1, field=Spline3$type, fun=sum)
>>
>>
>>
>> # resampling back to higher resolution
>>
>> c1 <- resample(c1, g.025)
>>
>> c2 <- resample(c2, g.025)
>>
>> c3 <- resample(c3, g.025)
>>
>>
>>
>> # preparing to extract area (m2) values from raster “g.025”
>>
>> c1tab <- as.data.frame(c1, xy=T)
>>
>> c2tab <- as.data.frame(c2, xy=T)
>>
>> c3tab <- as.data.frame(c3, xy=T)
>>
>> c1tab <- c1tab[which(is.na(c1tab$layer)!=T),]
>>
>> c2tab <- c2tab[which(is.na(c2tab$layer)!=T),]
>>
>> c3tab <- c3tab[which(is.na(c3tab$layer)!=T),]
>>
>>
>>
>> # extracting area (m2) values from raster “g.025”
>>
>> c1tab[,4] <- extract(g.025, c1tab[,1:2])
>>
>> c2tab[,4] <- extract(g.025, c2tab[,1:2])
>>
>> c3tab[,4] <- extract(g.025, c3tab[,1:2])
>>
>> names(c1tab)[4] <- "area_m2"
>>
>> names(c2tab)[4] <- "area_m2"
>>
>> names(c3tab)[4] <- "area_m2"
>>
>>
>>
>> # sum total area (m2)
>>
>> c1_area <- sum(c1tab$area_m2)
>>
>> c2_area <- sum(c2tab$area_m2)
>>
>> c3_area <- sum(c3tab$area_m2)
>>
>> tot_area <- sum(c1_area, c2_area, c3_area)
>>
>>
>> Thanks!
>>
>> Andre Rovai
>> Department of Oceanography and Coastal Sciences
>> Louisiana State University
>>
>> [[alternative HTML version deleted]]
>>
>> 

Re: [R-sig-eco] Package to analyse population time series (trend analysis)

2016-10-19 Thread Philippi, Tom
Manuel--
The proper package depends entirely on what your data are: how many sites
you collect counts for, whether those sites are a probability sample over
an area or the only locations you are making inferences about (e.g., census
counts), whether those counts have Poisson error, overdispersion (e.g.,
aggregation) or zero-inflation (e.g., extra 0s from bad weather or bad
sites), imperfect detection, marked individuals, whether you expect a
linear or only monotonic trend, whether you have covariates that vary by
year (e.g., annual precipitation or winter NINO3.4), or or other aspects I
haven't yet dealt with.

A bit more information about what you have in terms of data, and about the
questions you are interested within the broad definition of "trend" might
get you informed answers.

Tom 2



On Wed, Oct 19, 2016 at 5:47 AM, Manuel Spínola 
wrote:

> Dear list members,
>
> What is the appropriate package to analyze population time series (trend
> analysis) when you have one count per year.
>
> Best,
>
> Manuel
>
> --
> *Manuel Spínola, Ph.D.*
> Instituto Internacional en Conservación y Manejo de Vida Silvestre
> Universidad Nacional
> Apartado 1350-3000
> Heredia
> COSTA RICA
> mspin...@una.cr 
> mspinol...@gmail.com
> Teléfono: (506) 8706 - 4662
> Personal website: Lobito de río  site/lobitoderio/>
> Institutional website: ICOMVIS 
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] NMDS axes scores

2016-01-17 Thread Philippi, Tom
Conny--

Note that Jari's surface fitting is using ordination scores on the
right-hand predictor size of the formula, with some z as the response.

If you need something about species composition as your _response_ variable
in a linear model (e.g., with time, disturbance type, and treatment as
predictors, and perhaps site as a random effect), why not use each stand's
dissimilarity/distance from your reference forest sites?  The trend line
would be compositional distance or dissim v. time, with
color/symbols/whatever for different treatments.  That would have the
advantage of being easily & directly interpretable.  [The use-case where
that would fail is >>100% turnover so lots of 0 similarities to the
reference forests, so step-across or nmds might help put those large
distances in order.]  You might be able to set up the equivalent to your
GLM in adonis to get permutation significance tests.

I hope that this helps, or at least gives you a different way to think
about your problem, or else is so stupid that Jari gets annoyed and blasts
it with a valid solution.

Tom 2

 --
Tom Philippi
Quantitative Ecologist & Data Therapist
National Park Service


On Sun, Jan 17, 2016 at 8:04 PM, Conny  wrote:

> Thanks a lot for all the helpful responses and info.
>
> But I’m actually still not sure how to use both NMDS axes as a response
> (y) in a regression model - is this even possible??
>
> My overall goal is to model species compositional change over time in a
> restoration project (is the system getting more similar to the reference
> forest). I would like to create a trend line here in a graph, rather than
> just using an ordination plot.
>
> I thought about using the fitted values returned by ordisurf(), but as I
> understood it (please correct me if I’m wrong) it will use my restoration
> time again as a response and my axes scores as predictors.
>
>  So the z values will represent fitted age values rather than my sample
> scores (?) – so it would make no sense to plot it against my restoration
> time…
>
> I’m sorry if this is getting a bit confusing.
>
> Cheers,
> Conny
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] automating ações

2015-09-15 Thread Philippi, Tom
Katia--
Your attached script didn't come through. as attachments are stripped by
the mailing list.

However, I can give you 2 generic answers.

First, if your vector of species names is called SpList, and your data are
in a big SpatialPointsDataFrame spdf

for (SP in SpList) {
 spSP <- as(spdf[spdf$Species==SP,],'ppp')  # subset to just that
species and return as ppp object
temp1 <- bw.smoothppp(spSP,...)
temp2 <- nextstep(temp1)
tempplot <- plot(...)
# probably display or write to a file
png(filename=paste(SP,'smootheddensity.png',sep=''))
print(tempplot)
dev.off()
} # bottom for SP in SpList

That structure works well for things like hitting gbif, that need to be
done sequentially.  It can work for your use case, but will be slow, and is
considered poor R "style".

In general, in R, one should not use for loops, but rather let R vectorize
in line via one of the *apply functions or even more powerful tools in plyr
or other packages.  The "better" R style is to write a small wrapper
function DoPP() that takes a species name and spatial object as parameters,
performs the steps you want for each species, then returns the graph
object.  Then, you can use dlply() in the plyr package to apply that
function to each species, and return a list of graph objects.

I hope that this points you in a useful direction.

Tom 2
"If you find that you're spending almost all your time on theory, start
turning some attention to practical things; it will improve your theories.
If you find that you're spending almost all your time on practice, start
turning some attention to theoretical things; it will improve your
practice."
  -- Donald Knuth



On Tue, Sep 15, 2015 at 11:01 AM, Kátia Emidio  wrote:

> Dear all,
>
> I´m a beginner in using FOR in order to create a loop in R to automate
> ações in it.
>
> I have a script, attached, and I need to perform it for all species in the
> file  (test_richness). The package I'm using is the SpatStat. The loop
> needs to be created before the "bw.smoothppp" function (line 45), which get
> the sigma value, that is used as a parameter for the  "plot(Smooth...)..
> and the image needs to be automatic generated as a file.
> I'll appreciate any help!
> Cheers,
>
> --
> Kátia Emídio da Silva DSc
> Eng. Florestal
> Manaus/AM
>
>
>
> Forestry Engineer
> Manaus/AM-Brazil
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] T50 or ED50 for count data

2015-07-24 Thread Philippi, Tom
Mehdi--
You _can_ calculate estimates of the parameters at median germination from
the lme or gamm objects, but the confidence intervals are not easy.
I recommend using the survival package, specifically survfit.formula and
quantile.survfit, which gives a couple of examples

fit - survfit(Surv(time, status) ~ ph.ecog, data=lung)

quantile(fit)

Tom 2

On Fri, Jul 24, 2015 at 12:43 AM, Mehdi Abedi abedim...@gmail.com wrote:

 Dear Tom,
 Thanks for your kind reply.
 Could we calculate T50 in gamm4 and lme4 as well?
 I searched in survival but no special information about median term in the
 package survival.

 Warm regards,
 Mehdi

 On Fri, Jul 24, 2015 at 4:11 AM, Philippi, Tom tom_phili...@nps.gov
 wrote:

 Mehdi--
 Based on the example datasets germination and chickweed, drc would work
 with you using the final count of emergence as the number of seeds at risk
 of germinating.  If you get a satisfactory fit from one of the 6 or so
 function forms it supports for dose-response, you may be fine.  [I have
 scientific qualms about treating time as a dose, but you may have a valid
 justification.]

 However, I don't know that drc handles replicate subjects (petri dishes)
 followed over time, where you need a random effect for subject.  [You don't
 say whether you have replicate soil samples followed over time.]  If you
 have replicate soil samples,  you could fit something like a generalized
 linear mixed model with lme4 or related, or generalized additive mixed
 model with mgcv or gamm4. family='binomial', and for each subject 
 species, the sum of the seeds that germinated as the total.
 Alternatively, there are a number of packages for survival or time to
 event analysis.  Your T50 is equivalent to the median survival time (or
 median time to germination).  Look at the survival analysis task view:
 *https://cran.r-project.org/web/views/Survival.html
 https://cran.r-project.org/web/views/Survival.html*
 You may be able to use the survival package, as median survival times are
 available in most packages.  Gordon Fox (currently at USF) wrote a book
 chapter on applying such methods to germination times (albeit using SAS not
 R at the time), and might be willing to send you a pdf.
 I hope that this helps you with your analyses.

 Tom 2


 On Thu, Jul 23, 2015 at 2:03 PM, Mehdi Abedi abedim...@gmail.com wrote:

 Dear Colleagues,

 I am interested to calculate time to 50 percentage for count data.
 I am working on soil seed bank germination. I have recorded number of
 seedling emergence in 10 different times for species. The total number of
 seeds in soil is not known therefore it is not proportional data but we
 can
 argue that after this period all viable seeds already germinated(total
 seedling emergence).

 I am familiar with drc package for calculation T50 but only for
 proportional data like germination in petridishes which total number of
 seed in each pteridishes is known  T50 could be calculated.

 Could we calculate T50 for count data as well? because we have total
 seedling emergence in the end, therefore their distribution during time
 is
 important and having this index we can find when species germinate ...
 This indices could be useful for other count data specially survival
 studies.

 warm regards
 Mehdi


 *Mehdi Abedi Department of Range Management*

 *Faculty of Natural Resources  Marine Sciences *

 *Tarbiat Modares University (TMU) *

 *46417-76489, Noor*

 *Mazandaran, IRAN *

 *mehdi.ab...@modares.ac.ir mehdi.ab...@modares.ac.ir*

 *Homepage
 http://www.modares.ac.ir/en/Schools/nat/Academic_Staff/~mehdi.abedi*

 *Tel: +98-122-6253101 *

 *Fax: +98-122-6253499*

 [[alternative HTML version deleted]]

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




 --
 Data is not information. Information is not knowledge. And knowledge is
 certainly not wisdom.
-- Clifford Stoll

 Science is what we understand well enough to explain to a computer. Art
 is everything else we do.
-- Donald Knuth

 If you find that you're spending almost all your time on theory, start
 turning some attention to practical things; it will improve your theories.
 If you find that you're spending almost all your time on practice, start
 turning some attention to theoretical things; it will improve your
 practice.
   -- Donald Knuth




 --


 *Mehdi Abedi Department of Range Management*

 *Faculty of Natural Resources  Marine Sciences *

 *Tarbiat Modares University (TMU) *

 *46417-76489, Noor*

 *Mazandaran, IRAN *

 *mehdi.ab...@modares.ac.ir mehdi.ab...@modares.ac.ir*

 *Homepage
 http://www.modares.ac.ir/en/Schools/nat/Academic_Staff/~mehdi.abedi*

 *Tel: +98-122-6253101 *

 *Fax: +98-122-6253499*




-- 
Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom.
   -- Clifford Stoll

Science is what we understand well enough to explain to a computer. Art

Re: [R-sig-eco] T50 or ED50 for count data

2015-07-23 Thread Philippi, Tom
Mehdi--
Based on the example datasets germination and chickweed, drc would work
with you using the final count of emergence as the number of seeds at risk
of germinating.  If you get a satisfactory fit from one of the 6 or so
function forms it supports for dose-response, you may be fine.  [I have
scientific qualms about treating time as a dose, but you may have a valid
justification.]

However, I don't know that drc handles replicate subjects (petri dishes)
followed over time, where you need a random effect for subject.  [You don't
say whether you have replicate soil samples followed over time.]  If you
have replicate soil samples,  you could fit something like a generalized
linear mixed model with lme4 or related, or generalized additive mixed
model with mgcv or gamm4. family='binomial', and for each subject 
species, the sum of the seeds that germinated as the total.
Alternatively, there are a number of packages for survival or time to event
analysis.  Your T50 is equivalent to the median survival time (or median
time to germination).  Look at the survival analysis task view:
*https://cran.r-project.org/web/views/Survival.html
https://cran.r-project.org/web/views/Survival.html*
You may be able to use the survival package, as median survival times are
available in most packages.  Gordon Fox (currently at USF) wrote a book
chapter on applying such methods to germination times (albeit using SAS not
R at the time), and might be willing to send you a pdf.
I hope that this helps you with your analyses.

Tom 2


On Thu, Jul 23, 2015 at 2:03 PM, Mehdi Abedi abedim...@gmail.com wrote:

 Dear Colleagues,

 I am interested to calculate time to 50 percentage for count data.
 I am working on soil seed bank germination. I have recorded number of
 seedling emergence in 10 different times for species. The total number of
 seeds in soil is not known therefore it is not proportional data but we can
 argue that after this period all viable seeds already germinated(total
 seedling emergence).

 I am familiar with drc package for calculation T50 but only for
 proportional data like germination in petridishes which total number of
 seed in each pteridishes is known  T50 could be calculated.

 Could we calculate T50 for count data as well? because we have total
 seedling emergence in the end, therefore their distribution during time is
 important and having this index we can find when species germinate ...
 This indices could be useful for other count data specially survival
 studies.

 warm regards
 Mehdi


 *Mehdi Abedi Department of Range Management*

 *Faculty of Natural Resources  Marine Sciences *

 *Tarbiat Modares University (TMU) *

 *46417-76489, Noor*

 *Mazandaran, IRAN *

 *mehdi.ab...@modares.ac.ir mehdi.ab...@modares.ac.ir*

 *Homepage
 http://www.modares.ac.ir/en/Schools/nat/Academic_Staff/~mehdi.abedi*

 *Tel: +98-122-6253101 *

 *Fax: +98-122-6253499*

 [[alternative HTML version deleted]]

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
Data is not information. Information is not knowledge. And knowledge is
certainly not wisdom.
   -- Clifford Stoll

Science is what we understand well enough to explain to a computer. Art is
everything else we do.
   -- Donald Knuth

If you find that you're spending almost all your time on theory, start
turning some attention to practical things; it will improve your theories.
If you find that you're spending almost all your time on practice, start
turning some attention to theoretical things; it will improve your
practice.
  -- Donald Knuth

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Ancova Plot R

2015-03-16 Thread Philippi, Tom
Luis--
I assume that especie is a factor, else it would be multiple regression.
You do not give much information on what you want your graph to look like.
If you want the linear regression line for each especie, with confidence
bands around those lines, you can use ggplot2 (untested code):
library(ggplot2)
a$lPeso - log(Peso)
p - gplot(a,aes(x=Radio,y=lPeso)) + geom_point(aes(color=especie))
p + geom_smooth(aes(group=especie),method='lm')

I hope that this gets you started in the right direction.

Tom 2

On Mon, Mar 16, 2015 at 9:38 PM, Luis Fernando García luysgar...@gmail.com
wrote:

 Dear R users.

 I have an urgent question to make you, hopefully you can help me.

 This is my first time making an ancova test in R. I have finished the model
 design and now I need to plot it. I have found several online sources where
 the mechanics for plotting data is explained, nevertheless all of these
 explain the mechanics for plotting the data when there is no interacion
 which is not my case. I was thinking to subset the data following this
 example (

 http://r-eco-evo.blogspot.com/2011/08/comparing-two-regression-slopes-by.html
 ),
 but I do not know if it will be the best choice.

 If some of you could suggest me a document which explains this in detail or
 may be help me with the data set,

  a-read.table (test1ancova.txt, header=T)
  attach(a)
  names(a)
  fit3-lm(log(Peso) ~ especie * Radio)
  summary(fit2)

 Call:
 lm(formula = log(Peso) ~ especie * Radio)

 Residuals:
  Min   1Q   Median   3Q  Max
 -1.15349 -0.29033  0.0  0.39277  1.41267

 Coefficients:
   Estimate Std. Error t value Pr(|t|)
 (Intercept) 2.0610 0.2050  10.054 5.84e-11 ***
 especieNops 1.8253 0.3002   6.080 1.28e-06 ***
 Radio   0.3827 0.1414   2.707   0.0113 *
 especieNops:Radio  -0.3749 0.1438  -2.608   0.0142 *
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 Residual standard error: 0.5418 on 29 degrees of freedom
 Multiple R-squared:  0.7005,Adjusted R-squared:  0.6695
 F-statistic: 22.61 on 3 and 29 DF,  p-value: 9.578e-08


   plot(a$Radio,a$Peso, type=n,xlab=Ratio, ylab=Weight)
   points(a$Radio[1:15], a$Peso[1:15], col=red)
   points(a$Radio[16:30], a$Peso[16:30], col=blue)

 Any help will be really appreciated!

 Thanks in advance

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

Re: [R-sig-eco] Multivariate quasi-bionomial analysis of proportion data?

2015-02-09 Thread Philippi, Tom
Amanda--
I'm not sure I would be convinced by you analyses, as I don't think your
statistical model corresponds to your sampling or data generating process.
But, I'd need to know more information about the response design (data
collection) to make any suggestions.

For binomial or quasi-, you aren't analyzing the ratio of time observed
(DV) to total time observed, you're presumably using the number of minutes
or seconds?  If so, note that you get very different answers depending on
the units, because the binomial response is treating each point observation
as independent.  Depending on the animal and the behaviors, in my
experience not even minute or 10 minute observations are independent.

How long is an individual animal observed in a given bout (period of
consecutive recording)?  Are individuals monitored for more than 1 bout?
How many behaviors does it perform (on average) in one observation bout?
How many times does it switch behavior in a bout?  Even if it only does
behaviors A  B, if it is doing A when you start observing, at some point
it switches to B, and is still doing B when you stop recording, that is
very different than it switching back  forth A B A B A B A B A B in a
single bout.

If you have lots of switching by individual animals in individual bouts,
then there may be a reasonable mixed-model binomial-based approach,
treating individual animals as random subjects.  If not, there are some
approaches to proportional data that might be a better approximation to
your data and components of variation.  But I've already stuck my neck out
far enough guessing about how you might have collected your data, so I'll
stop here unless you provide more information.

I hope that this helps...

Tom 2

On Sun, Feb 8, 2015 at 3:27 AM, Amanda Greer manda.gr...@gmail.com wrote:

 Hi All,

 I am trying to best analyse a set of foraging ecology data with 10
 behaviour categories (DVs) and 3 levels of IV (season, sex, age). The time
 which an animal spent engaged in a behaviour was recorded and then divided
 by the total time spent in sight of the observer, so my data are
 proportional. As is typical, not all animals engaged in all behaviours and
 there are a large number of zeros in my dataset which is severely
 over-dispersed. I had initially analysed all the data using the glm
 function (family = quasibinomial, followed by anova. The intention was then
 to use the false discovery rate alpha to account for the large number of
 analyses. However, it was pointed out to me that a multivariate approach
 might be better so I have been trying to figure out (a) if it's possible to
 run a quasi-binomial multivariate analysis of proportion data  (b) how to
 go about it.

 In the R Documentation quasi-binomial family function page (

 http://artax.karlin.mff.cuni.cz/r-help/library/VGAM/html/quasibinomialff.html
 ) it is stated that if multivariate response = TRUE the response matrix
 should be binary. This seems a pretty straightforward indictment of my idea
 to run this analysis on my proportion data, but I am wondering why - is
 this just not possible, or is there a particular package that could help?
 If anyone could provide me with an answer or some much needed guidance on
 this topic I would be very grateful.

 Thanks,

 Amanda

 [[alternative HTML version deleted]]

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
---
Tom

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Can I weight LS means estimates by elevation in lme4/lmerTest packages?

2014-11-28 Thread Philippi, Tom
Matthew--
Do you get what you want in terms of estimated effect of ecosystem type by
dropping Elevation from your model?  That difference would include the
component of the response to the difference in mean elevation.

If not, from a pure parameter estimation perspective, if your simulated
data are representative, you have elevation ranging from 1200:1700 (m) in
grassland ecosystems, but 1800:3000 in shrublands.  Therefore, you have 2
df for elevation: a linear effect (Elevation) plus a step effect
(Ecosystem) split at 1750m, so forced parallel lines that don't overlap.
Your estimated effect for ecosystem is the offset of those parallel lines.
However, any nonlinearities in grass cover as a function of elevation will
be forced into that step (ecosystem) parameter, which could be misleading
or masking.

Is a linear effect of elevation reasonable for your range of elevations, or
are there likely to be curved or unimodal relationships where too low gets
too dry and too high gets too cold for grasses?  Before I did any model
fitting, I would plot 6 gam or loess lines of grass cover by elevation for
the 6 ecosystem * quality combinations.

I hope that this gives you food for thought, or alternatively that I'm way
off base and someone provides you with exactly the narrow answer you
requested.

Tom 2


On Fri, Nov 28, 2014 at 5:35 PM, Matthew Van Scoyoc sco...@gmail.com
wrote:

 Good afternoon r-sig-ecology,

 I'm running linear mixed models using the lme4 and lmerTest packages to
 examine ecosystem structure in grasslands and shrublands. The grasslands
 are located at lower elevations than the shrublands, and I would like to
 weight the estimates from LS means to reflect the differences in
 elevations. My colleague says there is an easy way to do it in SAS, but I
 haven't found a way to do it in R.

 There is an example dataset and workflow. Here I would be examining grass
 cover differences between the two ecosystems, the quality of the
 ecosystems, and examining the interaction between ecosystem and quality. I
 am also interested in systematic changes throughout the study area (not
 represented in this example) so I don't want to run separate analyses on
 for each ecosystem. I just want to adjust the LS means estimates to reflect
 the differences in elevation and not a mean elevation of the sampled plots.

 library(lme4)
 library(lmerTest)
 
  df = data.frame(PlotID = rep(c(paste0(G, 1:30), paste0(S, 1:30)), 2),
 + SamplePeriod = as.factor(c(rep(2012, 30), rep(2014,
 30))),
 + Ecosystem = rep(c(rep(Grassland, 30), rep(Shrubland,
 30)), 2),
 + Quality = rep(rep(c(rep(Good, 10), rep(Moderate, 10),
 rep(Poor, 10)), 2), 2),
 + GrassCover = c(runif(10, min = 0.50, max = 0.85), # 2012
 Grassland Good
 +runif(10, min = 0.50, max = 0.60), # 2012
 Grassland Moderate
 +runif(10, min = 0.30, max = 0.40), # 2012
 Grassland Poor
 +runif(10, min = 0.25, max = 0.60), # 2012
 Shrubland Good
 +runif(10, min = 0.20, max = 0.45), # 2012
 Shrubland Moderate
 +runif(10, min = 0.05, max = 0.25), # 2012
 Shrubland Poor
 +runif(10, min = 0.50, max = 0.90), # 2014
 Grassland Good
 +runif(10, min = 0.50, max = 0.55), # 2014
 Grassland Moderate
 +runif(10, min = 0.30, max = 0.30), # 2014
 Grassland Poor
 +runif(10, min = 0.25, max = 0.60), # 2014
 Shrubland Good
 +runif(10, min = 0.20, max = 0.30), # 2014
 Shrubland Moderate
 +runif(10, min = 0.05, max = 0.15))) # 2014
 Shrubland Poor
  Elevation = c(sample(1200:1700, size = 30, replace = T),
 sample(1800:3000, size = 30,
 replace = T))
  df$Elevation = c(Elevation, Elevation); rm(Elevation)
 
  lmm = lmer(GrassCover ~ Ecosystem*Quality + Elevation + (1|PlotID), data
 = df, REML = T)
  anova(lmm,  ddf = Satterthwaite, type = 3, method.grad = Richardson)
 Analysis of Variance Table of type 3  with  Satterthwaite
 approximation for degrees of freedom
  Sum Sq  Mean Sq  NumDF  DenDF
  F.valuePr(F)
 Ecosystem   1.68172 1.68172 1  113
  51.2398.79e-11 ***
 Quality   1.97119 0.98560 2   113
 152.6112.2e-16 ***
 Elevation 0.00311 0.00311 1  113
  0.016  0.89978
 Ecosystem:Quality 0.042010.02101 2  113  3.604
  0.03039 *
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
  lsms = lsmeans(lmm, method.grad = Richardson)
 estimates = lsms[[1]]$Estimate

 At this point I would be plotting the estimates of the significant response
 variables and interactions to look at the 

Re: [R-sig-eco] Using NADA and rkt for seasonal Mann Kendall tests for censored datasets

2014-07-24 Thread Philippi, Tom
Tim--

When I had decreasing MDL over a monitoring period, I looked at the
distribution of the values collected under the later, lower MDL (MDL2) that
were MDL1 (including BDL2), then used that distribution (estimated via
kerneling) to simulate/impute values under the earlier MDL1.  In my
application, log-normal was not a reasonable fit for those values.  In your
application log-normal may be fine, but I suggest plotting the pdf of those
values to be sure.  My approach can require pretty large sample sizes,
especially if you have to account for seasons and sites/regions.

If you haven't looked at it and rejected it for your application, you may
want to consider the EnvStats package as an alternative to rkt (and
possibly to NADA).  For my applications, I needed the additional tests for
heterogeneity it provides.

Also, if you need to do the equivalent of stl decomposition (graphing a
repeating seasonal component plus temporal trend), if you have irregular
times or missing values, or just don't have symmetric sinusoidal seasonal
patterns (for cosinor), Gavin Simpson has some nice work using GAMMs to fit
the data.
http://www.fromthebottomoftheheap.net/2014/05/09/modelling-seasonal-data-with-gam/

I hope that this helps a bit.
Tom 2

On Thu, Jul 24, 2014 at 8:59 AM, clarkbar clark.ti...@gmail.com wrote:

 I have recently put together some code for running trend analyses on
 censored
 datasets with shifting Method Detection Limits (MDLs). I have applied the
 NADA package to determine the log mean and log standard deviation; these
 are
 then used to generate log-normal random numbers which then replace the MDL
 observations. From this, I apply the rkt package for seasonal Mann-Kendall
 trend test to determine.
 I know the NADA package has cenken, but that has limited covariate and
 seasonal capabilities, and I believe my approach below is more
 conservative.

 I'm sorry if the code is sloppy, but I'm pretty new to all this.. going on
 nearly 2 months now.
 I would really appreciate your feedback and ideas. I think my method is
 similar to lt;a
 href=quot;
 http://dugi-doc.udg.edu/bitstream/handle/10256/708/Olea.pdf?sequence=1quot
 ;Olea's,
 but I could never get ILR to cooperate with me.

 Thanks in advance.

 #example data creation - probably not the best way to do this...
 test-matrix(c(rep(1:7,each=20,1),rep(1:2,each=70),
  rep(rlnorm(10,-5.6,.2),2),
  rep(rlnorm(10,-5.6,.3)*1.2,2),
  rep(rlnorm(10,-5.6,.3)*1.22,2),
  rep(rlnorm(10,-5.6,.2)*1.25,2),
  rep(rlnorm(10,-5.6,.3)*1.3,2),
  rep(rlnorm(10,-5.6,.35)*1.4,2),
  rep(rlnorm(10,-5.6,.35)*1.4,2)),
  ncol=3)
 colnames(test)-c('Year',Season,Obs)

 test-as.data.frame(test)
 with(test,rkt(Year,Obs,Season,correct=T,rep='a'))
 with(test,plot(Obs~Year))

 #function below
 require(NADA)
 require(rkt)
 MDL_fix-function(value,detection,year,block,CV,month,plot_title)
   #Observed Value vector, Detection Limit Vector,Block Vector for seasonal
 MannKendall,covariate,month,Plot title
 {
   YearMonth=year+month/12
   #using NADA package, determine log-mean of and log-standard deviation of
 the observed valued
   #if(mean(value)1) {#determine if log or -log should be used (cenfit has
 issues with negatives)
   #lmeanvalue-{sum(summary(cenmle(value,valuedetection))[[9]][1:2])}
   #lsdvaluelt;-{summary(cenmle(value,valuelt;detection))[[6]]}
   #}
   #if(mean(value)=1) {#determine if log or -log should be used (cenfit has
 issues with negatives)
   if(length(which(valuedetection))/length(value).50){
 return(ERROR: TOO MANY POINTS BELOW MDL)}
   lmeanvalue-{(median(cenfit((log(value)),valuedetection))[[1]])}
 lsdvaluelt;-{(sd(cenfit(log(value),valuelt;detection)))[[1]]}
   #}
   if(length(which(valuelt;detection))/length(value).05){ #only pursue MDL
 fix if 5% of data censored, otherewise: do 1/2 MDL method
 #create matrix to fill with Mann-Kendall Output
 rkt_1-matrix(1,runs,12)
 colnames(rkt_1)- c('2-sided
 p-value','Score',Slope,Var(Score),'2-sided
 p-value_corrected','Var(Score)_corrected',
 'Partial Score', 'Partial 2-sided p-value',
 Partial
 Var(Score), Partial 2-sided p-value_corrected
 , 'Partial Var(Score)_corrected',Tau)
 for(i in seq(1:runs)) {
 {
   xj-1:length(value) #create random number vector
   MDL_fixed-1:length(value) #create final value vector
   for(j in seq(1:length(value))) {
 {
   if(value[j]detection[j]) #only pursue number generation if the observed
 value is below the detection limit
 {
 #create a random number (distributed log-normally based on lmean and
 lsd) that is below the detection limit
 repeat {
   xj[j]lt;-rlnorm(1,lmeanvalue,lsdvalue)
   if(xj[j]lt;detection[j]) {break}
 }
   }
 }
 MDL_fixed[j]lt;-ifelse(value[j]lt;detection[j],xj[j],value[j])
 #replace values below the detection limit with generated number
   }
 }
 if(!missing(CV)amp; !missing(block)) { #if covariate and block 

Re: [R-sig-eco] Assigning Rank to Temporal Data

2014-03-12 Thread Philippi, Tom
In order to glue tide height (or tide current) onto each fish position
observation:

1: Convert dates  times to POSIXct (do it all in GMT to avoid DST issues!).
2: Consider using NOAA tide data, available as 6 minute interval time
series http://tidesandcurrents.noaa.gov/api/ .  Luke Miller has functions
for scraping their website (
http://lukemiller.org/index.php/2011/02/accessing-noaa-tide-data-with-r/ )
or I have a function:
https://drive.google.com/file/d/0B-vmKFnoibOpVzRXNWxTZkthSm8/edit?usp=sharing
3: Convert dates to POSIXct, then data to ts.
4: Linearly interpolate the tide data to the times of your fish position.
5: glue tide data to fish data.
6:  Use cut() or more complex computations to make your categorical tide
status factor.

I hope that this helps...
Tom 2



On Tue, Mar 11, 2014 at 3:28 PM, Stacy Bierwagen sbier...@hawaii.eduwrote:

 Hello Listers,

 I am tracking home ranges of herbivorous reef fish and I have an extremely
 large data set that includes 300,000 time-stamped lat/long coordinates. I
 would like to assign a rank number to date/time in order to test position
 of fish versus tidal level. I only have only taken tide level 4x/day so I
 would like to create a function that ranks my temporal lat/long data as
 presence/absence (0/1) of high tide or low tide to see if this
 environmental factor significantly affects movement by using T-testing. I
 am not sure how I would go about making a loop or incrementing an i value
 to represent this in R. Should I first assign a numerical value to the
 date/time before I try to quantify this?

 I included headers of the data I am looking at below if this helps.

 Stacy Bierwagen

 TIDE DATA
 date time tidal height
 days since day 0 time as 24h fraction h since day 0
 days since day 0 tidal height
  6/1/2013 4:13 0.33
 0 0.17569 4.21667
 0.17569 0.33
  6/1/2013 11:05 1.31
 0 0.461805556 11.0833
 0.461805556 1.31
  6/1/2013 14:31 1.04
 0 0.60486 14.5167
 0.60486 1.04
  6/1/2013 21:25 1.66
 0 0.89236 21.4167
 0.89236 1.66
  6/2/2013 4:49 0.1

 TEMPORAL DATA:
Date TimeGMT DTHI DTConvertHI TConvertHI Frequency Signal Gain (dB)
 Latitude Longitude  6/14/2013 23:52:52.336 06142013 23:52:52.336 06142013
 13:52:52.336 13:52:52.000 63 86 0 21.43267 -157.791  6/14/2013
 23:52:50.295 06142013
 23:52:50.295 06142013 13:52:50.295 13:52:50.000 63 90 0 21.43267 -157.791
 6/14/2013 23:52:38.051 06142013 23:52:38.051 06142013 13:52:38.051
 13:52:38.000 63 88 0 21.43267 -157.791  6/14/2013 23:52:37.032 06142013
 23:52:37.032 06142013 13:52:37.032 13:52:37.000 63 86 0 21.43267 -157.791

 [[alternative HTML version deleted]]

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
---
Tom Philippi, Ph.D.
Quantitative Ecologist  Data Therapist
Inventory and Monitoring Program
National Park Service
tom_phili...@nps.gov
http://science.nature.nps.gov/im/monitor

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] function to extract climatic variables from shape file - error

2013-09-18 Thread Philippi, Tom
Maria--

I believe that your problem is with the line:
s1-readShapePoly(shapes[i], verbose=T)

It may be as simple as removing the quote marks:
s1-readShapePoly(shapes[i], verbose=T)
as shapes is a vector of character.  However, I use rgdal::readOGR() to
read shapefiles, so your mileage may vary, specifically on whether you need
to drop the file extension .shp via something like:
shapes - sub('\\.shp$','',shapes)

I hope that this helps.  If you have problems with the actual computations,
r-sig-geo might be a more useful list, but please do not cross post to both
lists.

Tom


On Wed, Sep 18, 2013 at 3:08 PM, Maria Fernanda Bonetti 
fernanda_bone...@hotmail.com wrote:

 Hi allI need to extract climatic variables of many shapefiles. I did a
 function to do this, but appears a error Error in getinfo.shape(filen) :
 Error opening SHP file.
 These are some shapes that I will need use (
 https://dl.dropboxusercontent.com/u/10809384/bios/shapes/teste.zip - just
 to use how example). The climatic variables I downloaded from Bioclim
 (.bil).
 This is the function:
 library(raster)library(rgdal)library(maptools)library (rgeos)library
 (gpclib)
 #listing the shapes in directoryshapes-grep(\\.shp$,
 list.files(recursive=T), value=T)len-length(shapes)
 # listing all .bil rastersbios-list.files(pattern='bil', full.names=T)
 biosT-stack(bios[1:5]) # Just testing with the first 5 rasters from Bioclim

 Temperatura-list()nomes-list ()for(i in 1:length(shapes)){
  s1-readShapePoly(shapes[i], verbose=T)
  nomes[[i]]-as.character(s1$SCI_NAME)  dados-extract(biosT, s1, df=T)
  media-sapply(dados, mean, na.rm=T)  minimo-sapply(dados, min, na.rm=T)
  maximo-sapply(dados, max, na.rm=T)  mediana-sapply(dados, median,
 na.rm=T)  desvio-sapply(dados, sd, na.rm=T)  soma-sapply(dados, sum,
 na.rm=T)  coef.var-sapply(dados, cv, na.rm=T)  results-cbind(media,
 desvio, minimo, maximo, mediana, coef.var, soma)
  Temperatura[[i]]-results}(here appears the error) Error in
 getinfo.shape(filen) : Error opening SHP file
 names(Temperatura)-nomes
 library(plyr)df-ldply(Temperatura,
 data.frame)df$bio-c(id,bio1,bio2,bio3,bio4,bio5,bio6,bio7,bio8,bio9,bio10,bio11)
 table (table(paste(teste$ID, teste$SCI_NAME, sep=)))


 Someone knows what is happening and how can I fix it?


 Maria Fernanda Bonetti

 [[alternative HTML version deleted]]

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
---
Tom Philippi, Ph.D.
Quantitative Ecologist  Data Therapist
Inventory and Monitoring Program
National Park Service
tom_phili...@nps.gov
http://science.nature.nps.gov/im/monitor

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] merge multi part polygons

2013-09-13 Thread Philippi, Tom
Maria--

I believe that your NAs from the extract are because those polygons do not
cover at least 1 center point of a pixel in your bioclim rasters.  One
solution is to use small=TRUE in your call to extract: see ?extract for
more information.  Another would be to pull the bioclim data with pixels
smaller than 2.5minutes (and still use small=TRUE).

Your other issues require more information about the scales of your
bioclime data and of the entire species ranges, the reason the species
ranges are multi-part polygons (e.g., many mountain tops, or a broad range
where they separately digitized water as holes and islands as small
included polygons), and most importantly, what question you are trying to
ask.

If you want an 'average' for each bioclim variable for each species, you
should make sure your extract specifies fun-'mean':
   v - extract(bios,spX,small=TRUE,fun='mean')
and then average across your polygons weighted by polygon area.  If a
substantial proportion of your polygons are smaller than ~50 pixels, you
may need to use weights=TRUE within the extract() call, especially if your
bioclim values vary greatly between adjacent pixels.  I don't know that
will work in combination with using extract() on the rasterBrick of
bioclim, as the result would be a 3 or 4 dimensional array instead of a
matrix.  If that happens, then you need to make your SpatialPolygons object
have a single Polygon in its polygons slot.  My copy of ASDAR (Bivand et
al.'s Applied Spatial Data Analysis with R) is AWOL, but I believe it shows
how to do this.

That said, in my work, such average values would be meaningless for the
questions I ask: I need to compute the ranges of each bioclim attribute for
a given species, and an envelope of the combinations of bioclim attribute
values occurring within the geographic range.  [I'm using bioclim
attributed generated for the US at 800m pixels based on PRISM normals, and
rangemaps of vertebrates.]

I hope that this points you in a useful direction,
Tom



On Fri, Sep 13, 2013 at 12:38 PM, Maria Fernanda Bonetti 
fernanda_bone...@hotmail.com wrote:

 Hi all




 I need to extract climatic variables of many shapefiles. I managed to do
 it, but doing one at a time (I have to do this for 3000 species).
 Well, I did the following:
 first I downloaded the rasters of BIOCLIM bios -getData (Worldclim var
 = bio, res = 2.5, T = download)
 Ok, then I went with shapes (2 species did just to test). teste_sp1
 -readShapePoly (sp1, IDVAR = NULL, proj4string = CRS (as.character
 (NA)), verbose = FALSE, repair = FALSE, FALSE = force_ring, delete_null_obj
 = FALSE, retrieve_ABS_null = FALSE)
 Finally, I used a function to extract the variables of shapes v -
 extract (bios, sp1)

 THE PROBLEM:The sp1 is a multi part polygon (23 features). I could make
 the average of these 23 lines and everything would be ok right? The problem
 is that some lines appears NA, and I have no idea why, and I can not just
 ignore that.
 Do I need to join these parts into a single polygon? I tried this with
 unionSpatialPolygons but another error appears: Error in
 unionSpatialPolygons (sp1, IDS, threshold = NULL,: input lengths differ
 Can anyone help me?

 Att,
 Maria Fernanda BonettiDoutoranda do Programa de Pós Graduação em Ecologia
 e Conservação - UFPRMestre em Ecologia e ConservaçãoPesquisadora do
 Instituto de Pesquisas Cananéia - IPeC




 [[alternative HTML version deleted]]


 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
---
Tom Philippi, Ph.D.
Quantitative Ecologist  Data Therapist
Inventory and Monitoring Program
National Park Service
tom_phili...@nps.gov
http://science.nature.nps.gov/im/monitor

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Any R packages equivalent to Fragstats?

2013-04-17 Thread Philippi, Tom
I would start with SDMTools


On Wed, Apr 17, 2013 at 10:22 AM, Bruce Miller batsnc...@gmail.com wrote:

 Hi all,

 Any/*R*/ packages that include the equivalent of FRAGSTATS for
 fragmentation indices?
 Tnx

 Bruce

 [[alternative HTML version deleted]]

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
---
Tom Philippi, Ph.D.
Quantitative Ecologist  Data Therapist
Inventory and Monitoring Program
National Park Service
(619) 523-4576
tom_phili...@nps.gov
http://science.nature.nps.gov/im/monitor

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] quantifying directed dependence of environmental factors

2013-03-07 Thread Philippi, Tom
Jay--

I'm not sure how one would combine SEM / graphical models with
compositional dissimilarity as a response.  You might be able to fit a
series of models in adonis() or capscale(), comparing just direct factors
to direct + intermediate, etc..  I don't have any good ideas on how you
might test more complex causal structures.

Given that you are dealing with diatoms across space (with environmental
measurements) and down time (in cores, often without environmental
measures), there may be an alternate approach possible based on calibration
approaches to inferred environments (e.g., WACAL) or modern analogs.  I
would look at packages bio.infer, paltran, fossil, and analogue, and search
to see if anyone has pushed them in the direction you want to go.

Tom



On Thu, Mar 7, 2013 at 6:50 AM, Jay Kerns gjkerns...@gmail.com wrote:

 Dear Sarah,

 On Thu, Mar 7, 2013 at 9:32 AM, Sarah Goslee sarah.gos...@gmail.com
 wrote:
  That sounds like a job for path analysis or for structural equation
  modeling, depending on the level of sophistication desired and the
  hypotheses to be tested.
 

 *Yes!*  I said almost the exact same thing (I didn't say anything
 about Path Analysis because I don't know much about it), but I had it
 in my mind that SEM was targeted more to sociological things and
 didn't know if/that it was common in ecological contexts.  Anyway,
 it's nice to hear that word coming from somebody else.

  There are plenty of good resources for both, in and out of R.

 Indeed.  I have some work to do.  Thank you.

 --
 Jay



  Sarah
 
 
  On Wednesday, March 6, 2013, Jay Kerns wrote:
 
  Hello,
 

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
---
Tom Philippi, Ph.D.
Quantitative Ecologist  Data Therapist
Inventory and Monitoring Program
National Park Service
(619) 523-4576
tom_phili...@nps.gov
http://science.nature.nps.gov/im/monitor

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Zuur / Pinierho / Faraway

2012-11-30 Thread Philippi, Tom
At least some errata at:
http://www.maths.bath.ac.uk/~jjf23/ELM/
http://www.maths.bath.ac.uk/~jjf23/ELM/errata.html

Tom 2


On Fri, Nov 30, 2012 at 10:47 AM, Dave Roberts 
dvr...@ecology.msu.montana.edu wrote:

 Philip,

IS there an online errata,or do you just have to be smart and diligent?

 Thanks, Dave

 On 11/29/2012 07:30 AM, Dixon, Philip M [STAT] wrote:

 I agree with all the previous comments and second Tom's recommendations
 of Faraway as an 'in between' Zuur and Piniehro  Bates.  One thing to be
 careful of: While the advice in Faraway is sound, there are more than a few
 mistakes in his equations.


 Philip Dixon

 __**_
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/**listinfo/r-sig-ecologyhttps://stat.ethz.ch/mailman/listinfo/r-sig-ecology


 --
 ~~**~~**
 
 David W. Roberts office 406-994-4548
 Professor and Head  FAX 406-994-3190
 Department of Ecology email drobe...@montana.edu
 Montana State University
 Bozeman, MT 59717-3460


 __**_
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/**listinfo/r-sig-ecologyhttps://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
---
Tom Philippi, Ph.D.
Quantitative Ecologist  Data Therapist
Inventory and Monitoring Program
National Park Service
(619) 523-4576
tom_phili...@nps.gov
http://science.nature.nps.gov/im/monitor

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Zuur or Pinheiro Bates

2012-11-28 Thread Philippi, Tom
I would also recommend considering Faraway (2006) Extending the linear
model with R
and the draft chapters of Doug bates' lme4 book:
http://lme4.r-forge.r-project.org/book/
The choice between Pinheiro  Bates, Zuur, and Faraway is part style and
part specific applications of mixed models: I need generalized linear mixed
models for longitudinal data, so I find Faraway more useful.



On Wed, Nov 28, 2012 at 10:17 AM, Martin Weiser weis...@natur.cuni.czwrote:

 Dear ecology fellows,
 I need to work with mixed-effects models in more detailed way than what
 M. Crawley described in the R-book. This book suited me well, but I feel
 I need to go further in this topic. I am able to buy one book.
 Would you recommend:
 a, Zuur et al. (2009): Mixed Effects Models and Extensions in Ecology
 with R
 XOR
 b, Pinheiro  Bates (2000/2009): Mixed-Effects Models in S and S-PLUS?

 Thank you for your patience.
 With the best regards,
 Martin Weiser
 (Dept. of Botany, Science Faculty, Charles University in Prague, Czech
 Rep.)

 ___
 R-sig-ecology mailing list
 R-sig-ecology@r-project.org
 https://stat.ethz.ch/mailman/listinfo/r-sig-ecology




-- 
---
Tom Philippi
Quantitative Ecologist  Data Therapist
Inventory and Monitoring Program
National Park Service
tom_phili...@nps.gov
http://science.nature.nps.gov/im/monitor

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology