I'll offer my two cents as someone who has been using spatstat a lot recently 
to do spatial point process analysis.

I think it's entirely useful, when organizing spatial analysis libraries, to 
treat the sp package as a master package for getting spatial data into and out 
of R and for performing basic metadata operations on such data (so one can 
simplify a lot by using OGR/GDAL and projection functionality through this one 
interface).
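
For instance, a typical round trip through that interface might look 
something like the sketch below (the file and layer names, and the target 
projection, are just placeholders):

  library(rgdal)  # OGR/GDAL bindings; reads and writes sp objects

  # Read a shapefile into a SpatialPointsDataFrame
  pts <- readOGR(dsn = ".", layer = "sites")

  # Reproject via sp's CRS machinery (hypothetical target projection)
  pts.utm <- spTransform(pts, CRS("+proj=utm +zone=17 +datum=NAD83"))

  # Write the result back out through the same interface
  writeOGR(pts.utm, dsn = ".", layer = "sites_utm",
           driver = "ESRI Shapefile")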

Packages like spatstat (or raster, or whatever) might include some shortcuts to 
directly import common data from the world at large, but they are also armed 
with functions to convert in and out of sp data types (so the burden of 
wide-ranging input/output formats and geospatial projection is carried through 
sp).  The main need for data conversion functions in packages like spatstat is 
to take data in forms the user is likely to have and convert them into a 
structure that permits efficient application of the algorithms to which the 
package is dedicated (spatial point-process analysis, raster evaluation, 
etc.).  So I see nothing wrong with spatstat working with its own specialized 
types that can be built out of more general spatial data.  In fact, I even 
think it's rather nice to do that, because it helps you form an effective 
mental map of what you're doing and thus avoid some basic errors (e.g. trying 
to get a meaningful spatial point-process analysis when you haven't identified 
a window).
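
As a minimal sketch of that conversion step (continuing with the 
hypothetical pts.utm object from the sketch above, and an invented 
10-km-square window):

  library(spatstat)

  # Pull raw coordinates out of the sp points object
  xy <- coordinates(pts.utm)

  # Declare the observation window explicitly -- here a made-up square
  W <- owin(xrange = c(0, 10000), yrange = c(0, 10000))

  # Build the spatstat point pattern; points falling outside W are
  # rejected with a warning rather than silently kept
  pp <- ppp(x = xy[, 1], y = xy[, 2], window = W)
  summary(pp)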

For novices trying to get their minds around what is going on with R spatial 
packages, that observation suggests the useful strategy of first trying to 
understand each package in its own terms (why was it created, what is it 
supposed to do, what functions does it provide, and what are the requirements 
of the data structures on which those functions operate), then trying to match 
one's own problem to the package functionality, and finally gluing the data and 
operations together with a minimal set of conversions.  Naturally, that's an 
iterative process in practice.  But that approach has helped me avoid the 
perilous shortcut of simply trying to apply function X from package P to my 
data D (which inevitably means that I impose my own fantasy of what X should 
do, and then get bogged down dealing with "bugs" that emerge directly from my 
own lack of understanding of how the problem, the tools, and the data properly 
fit together).

From a package interface standpoint, I think a significant source of confusion 
for newcomers is the fact that you can do (for example) raster calculations on 
lots of different types of objects, using different functions, with different 
interfaces:  an im object in spatstat would use 'eval.im', while raster 
provides the 'calc' function, and sp lets you do basic data-frame computations 
transparently.  And of course the standard plotting facilities for each of 
these packages provide different defaults and capabilities (which for me is 
the biggest headache -- I end up doing a lot of data conversions just so I 
don't have to remember all the alternative options and defaults).
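
To make that concrete, here is roughly the same cell-wise computation in 
each of the three idioms (Z, r, and g stand in for an existing im, 
RasterLayer, and SpatialGridDataFrame, and the attribute column 'v' is 
assumed):

  # spatstat: evaluate an expression over pixel images
  Z2 <- eval.im(Z^2 + 3)

  # raster: apply a function to cell values
  r2 <- calc(r, fun = function(x) x^2 + 3)

  # sp: grid attributes behave like ordinary data-frame columns
  g$v2 <- g$v^2 + 3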

This gets to Barry's point about spatstat (or almost any well-developed spatial 
package) providing a lot of functionality that supports the primary mission, 
but that also has more general applicability.  For users who specialize in 
spatial point process analysis, it's very nice to have a complete set of tools 
on hand in spatstat so you don't have to go hunting around.  But the pitfall 
is that adding a lot of "extras" encourages users to expect that the "perilous 
shortcut" I described above is available to them (that they can just throw 
random data at a function they only half understand and magically get the 
answer they're looking for).

In the long run, it would probably be helpful (perhaps in sp or some 
supporting package such as maptools or rgeos) to "establish" generic functions 
for additional operations such as raster calculations, which individual 
packages could then specialize around a common interface, so that common 
geospatial operations are easily perceived as common.  But the example of what 
has happened with the
spplot output functions in sp (specifically, that specializations are not 
widely available in other packages) is perhaps a cautionary tale about how such 
an approach is likely to work in practice.  I suspect this kind of anarchy 
reflects both the strength and weakness of distributed package development by 
unrelated teams:  each team does what they need (including supporting tools) 
with greater or lesser attention to what might be available in the larger 
environment or other packages.  Teams and users then end up working with 
whichever version of the overlapping functionality they find easiest and most 
comprehensible.  That's good, because such "competition" and selection 
helps drive innovation as we all collectively contribute to figuring out what 
works best.  But it's also bad, especially for novices who end up writing 
strange mash-ups of data conversions, performing the same operation here and 
there with partially equivalent functions from several different packages, and 
struggling with general confusion about what's going on and why.
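
In S4 terms, the idea would be for one package to own a generic that the 
others specialize.  A purely hypothetical sketch (the 'mapCalc' generic 
and its methods do not exist in any package):

  # A common generic, e.g. established in sp or a small support package
  setGeneric("mapCalc", function(x, fun, ...) standardGeneric("mapCalc"))

  # spatstat could specialize it for im objects...
  setMethod("mapCalc", "im",
            function(x, fun, ...) eval.im(fun(x)))

  # ...and raster for RasterLayer objects, behind the same interface
  setMethod("mapCalc", "RasterLayer",
            function(x, fun, ...) calc(x, fun = fun, ...))

Then mapCalc(Z, function(v) v^2 + 3) would read the same to the user 
whether Z happens to be an im or a RasterLayer.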

Notwithstanding the challenges, I am deeply grateful to all the R spatial 
developers who have put together this amazing set of useful resources...

Jeremy Raw, P.E., AICP
FHWA Office of Planning
jeremy....@dot.gov
(202) 366-0986


-----Original Message-----
From: r-sig-geo-boun...@stat.math.ethz.ch 
[mailto:r-sig-geo-boun...@stat.math.ethz.ch] On Behalf Of Barry Rowlingson
Sent: Tuesday, October 19, 2010 7:54 AM
To: Karl Ove Hufthammer
Cc: r-sig-geo@stat.math.ethz.ch
Subject: Re: [R-sig-Geo] Creating density heatmaps for geographical data

On Tue, Oct 19, 2010 at 11:58 AM, Karl Ove Hufthammer <k...@huftis.org> wrote:

> And though the 'window' element of 'ppp' objects may be of use to some
> people, I haven't had any use for it. The annoying thing here is that the
> constructor doesn't generate the window automatically, based on the extent /
> bounding box of the data, and doesn't have an *option* for doing this, either.
> Whenever I have used 'spatstat' (not too often), I have had to spend too
> much time looking up how the window should be specified. Having [0,1] ×
> [0,1] as the *default* window, and excluding any points outside it, does
> seem like a strange design decision.

 I had this 'argument' with Rolf and Adrian a few years ago during a
very nice stay in Perth with them. spatstat is not about (geo)spatial
data - it's about statistical point pattern analysis. A statistical
point pattern is only well-defined when there's a window. Otherwise it
ain't a point pattern. And spatstat doesn't have any business with
non-spatial point patterns! :)

 There's a lot of functionality in spatstat that people want to use in
other contexts, such as some of the transformations or window
manipulation functions, and I think these could be usefully taken out
and put into a package that works with sp-class objects.

 But ppp objects are perfectly understandable and sensible if all you
do is point pattern analysis!

Barry

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-geo
