Re: [R-sig-Geo] kriging [SEC=UNCLASSIFIED]

Jin.Li Sun, 06 Jul 2008 19:13:56 -0700

Dear Tom,

Many thanks for your comments and the relevant information. It seems that we
need to clarify what RK is. The definition of RK is, however, not quite clear
and is debatable. Because of this, I did not give a definition of RK in my
review. The list of RK methods contains those that have previously been called
RK. My review implied that RK is the combination of two components: a
regression model and a kriging method, e.g. OLS regression + OK of residuals.
KED, UK and REML_EBLUP were each treated as independent methods in my review.
The key difference is that in RK the prediction is the sum of these two
components, which are calculated separately, whereas in the other methods the
two components are not calculated separately. I agree with you that they could
all be treated as versions of one generic approach, but we need to take this
difference into account. I hope I have made myself clear here, and I also hope
my list was not too misleading. Given that this review is now under peer
review, I am still able to make corrections and to include new references if
you have any relevant ones.
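
To make the distinction concrete, here is a minimal base-R sketch (not taken
from the review). It compares a two-step prediction (trend plus kriged
residuals) with a one-step UK/KED prediction at a single location. Everything
in it is an assumption made for illustration: the 1-D locations `s`, the
predictor `x`, the simulated target `z`, and the exponential covariance
`covfun` with its sill and range. Simple kriging of the residuals is used in
place of OK so that the comparison with UK is exact.

```r
# Illustrative sketch only: synthetic data, assumed covariance parameters.
set.seed(1)
n <- 30
s <- seq(0, 100, length.out = n)          # hypothetical 1-D sample locations
x <- sin(s / 15)                          # hypothetical auxiliary predictor
X <- cbind(1, x)                          # design matrix with intercept

covfun <- function(h) 0.25 * exp(-h / 20) # assumed exponential covariance, no nugget
C  <- covfun(as.matrix(dist(s)))          # covariance among observations
z  <- drop(X %*% c(2, 1.5) + t(chol(C)) %*% rnorm(n))  # simulated target

s0 <- 42                                  # prediction location (not sampled)
x0 <- c(1, sin(s0 / 15))                  # predictors at s0
c0 <- covfun(abs(s - s0))                 # covariance between s0 and the data

# (1) Two steps with OLS: trend + simple kriging of the OLS residuals
b_ols  <- solve(crossprod(X), crossprod(X, z))
rk_ols <- sum(x0 * b_ols) + sum(c0 * solve(C, z - X %*% b_ols))

# (2) Two steps with GLS: trend + simple kriging of the GLS residuals
Ci     <- solve(C)
b_gls  <- solve(t(X) %*% Ci %*% X, t(X) %*% Ci %*% z)
rk_gls <- sum(x0 * b_gls) + sum(c0 * (Ci %*% (z - X %*% b_gls)))

# (3) One step (UK/KED): kriging weights and Lagrange multipliers together
A  <- rbind(cbind(C, X), cbind(t(X), matrix(0, 2, 2)))
w  <- solve(A, c(c0, x0))[1:n]
uk <- sum(w * z)

print(c(rk_ols = rk_ols, rk_gls = rk_gls, uk = uk))
```

Under these assumptions the GLS two-step prediction and the UK prediction
agree to numerical precision, while the OLS two-step prediction generally
differs from both.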

If the listed methods are correct, we should then follow a statistically
sound procedure, as you suggested. Several methods have been applied, e.g.
OLS, GLS, GAM and GLM. The next question is perhaps how to choose an
appropriate method for the data to be modelled, and how to judge whether the
selected method is appropriate. Obviously, GLS is one candidate. Since there
are many data types, such as count data (bird abundance in Pebesma, Duin and
Burrough, 2005) and percentage data (e.g. the sediment content data I am
going to model), care must be taken in selecting an appropriate regression
model. Any further discussion of this topic could benefit everyone
interested in RK or the like.

Best regards,

Jin

-----Original Message-----
From: Hengl, T. [mailto:[EMAIL PROTECTED] 
Sent: Friday, 4 July 2008 5:55
To: Li Jin
Cc: [email protected]
Subject: RE: [R-sig-Geo] kriging [SEC=UNCLASSIFIED]

Dear Jin,

I really think that this list of yours is misleading. I agree that your
references are correct and that there are indeed computational differences;
however, one should follow a statistically sound procedure (this includes
e.g. GLS estimation of the regression coefficients and use of a proper
transformation). So the methods you mention (including KED and UK) are just
versions of one generic technique. In fact, I would even call ordinary
kriging and regression estimation special cases of RK (the terms UK and KED
are equally valid). See also Fig. 2.3, "Decision tree for selecting a
suitable spatial prediction model", in my lecture notes.

In practice, even if you use the most sophisticated RK method, the result
might not differ much from that of the simplest version (OLS regression + OK
of residuals). This has been nicely demonstrated by Minasny and McBratney
(2007):

Minasny, B., McBratney, A. B., 2007. Spatial prediction of soil properties
using EBLUP with Matérn covariance function. Geoderma 140: 324-336.
http://dx.doi.org/10.1016/j.geoderma.2007.04.028

The true alternatives to RK are machine learning and BME-type techniques,
because these differ fundamentally from RK. I just came back from the ICCSA
conference in Perugia, where I met Mikhail Kanevski, who is promoting GRNN
(see http://www.springerlink.com/content/y5j566015784188p), which is again
fundamentally different from RK (I think they use a kind of k-NN algorithm to
incorporate spatial auto-correlation). Maybe this is the list that you should
be making (see also the SIC exercise:
http://www.ai-geostats.org/index.php?id=44).

Going back to Edzer's questions: NO, we are not (yet) working on
implementing a package that allows much larger families of statistical models
(regression trees, GAMs, multinomial regression etc.) to be integrated with
geostatistical prediction, including methods for local RK, model-based
sampling and sampling optimisation for RK-type models (but I am increasingly
thinking about it); and YES, I strongly believe that implementing regression
and kriging estimation separately has many benefits (as opposed to the KED
algorithm implemented in gstat and many other packages). Anybody interested
in these topics should take a look at sections 2.8 "Final notes about
regression-kriging" and 2.2 "Local versus localized models" in my lecture
notes (I like to refer to them because they are open-access material).

T. Hengl
http://spatial-analyst.net

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Fri 7/4/2008 2:18 AM
To: [EMAIL PROTECTED]; Hengl, T.
Cc: [email protected]; [EMAIL PROTECTED]
Subject: RE: [R-sig-Geo] kriging [SEC=UNCLASSIFIED]

Hi All,

I have recently reviewed spatial interpolation methods for environmental
scientists. Regression kriging (RK) is one of over 40 methods reviewed.
Attached is what I described in the draft of my review. Obviously, RK as
described in the review is different from Edzer's approach. Any comments?
Thanks.
Cheers,

Jin

--------------------------------------------
Jin Li, PhD
Spatial Modeller/
Computational Statistician
Marine & Coastal Environment
Geoscience Australia

Ph: 61 (02) 6249 9899
Fax: 61 (02) 6249 9956
email: [EMAIL PROTECTED]
--------------------------------------------

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Edzer Pebesma
Sent: Thursday, 3 July 2008 9:33
To: Hengl, T.
Cc: [email protected]; Dave Depew
Subject: Re: [R-sig-Geo] kriging

Hengl, T. wrote:
> I agree with Paulo - gstat can work with any linear model, including
> transforms of the original predictors, e.g.:
>
> Z ~ X + I(X^2) + Y + I(Y^2)    etc.
>
> The problem is that gstat implements the so-called
> kriging-with-external-trend algorithm to make predictions (see section 2.1
> of my lecture notes), which is mathematically more elegant, but then it
> accepts only the family of linear models (and not GLMs, regression trees
> etc.). I have been promoting the concept of regression-kriging
> (deterministic and stochastic predictions separated), but we have not
> implemented it in any package so far.
>  
And I can see why, as there are quite a few problems still to solve
(afaik) ahead of you. When you cut the problem in two and do the regression
estimation and residual prediction in two separate processes (often
under different assumptions, e.g. wrt spatial correlation), you ignore
the correlation between the two. For example, finding a prediction variance
by naively adding the variances of the two components does not yield
zero variance at observation locations, because a non-zero correlation
is ignored. At other locations, this correlation is also non-zero.
Furthermore, if you cut the problem in two for e.g. binomial or Poisson
distributed cases, you are likely to end up with negative predictions, or
with predictions above one in the binomial case.

Does the paper you refer to (by yourself) give solutions to these two
problems?
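
The first of these two problems (the variance) can be sketched numerically in
base R. This is an illustration under invented assumptions, not gstat code:
the 1-D locations `s`, the predictor `x`, the simulated values `z` and the
nugget-free exponential covariance `covfun` are all hypothetical, and simple
kriging of the residuals stands in for the residual-kriging step. The
prediction point is deliberately an observation location.

```r
# Illustrative sketch only: synthetic data, assumed covariance parameters.
set.seed(1)
n <- 30
s <- seq(0, 100, length.out = n)          # hypothetical 1-D sample locations
x <- sin(s / 15)                          # hypothetical auxiliary predictor
X <- cbind(1, x)                          # design matrix with intercept

covfun <- function(h) 0.25 * exp(-h / 20) # assumed exponential covariance, no nugget
C  <- covfun(as.matrix(dist(s)))          # covariance among observations
z  <- drop(X %*% c(2, 1.5) + t(chol(C)) %*% rnorm(n))  # simulated target

i  <- 10                                  # predict at the 10th observation location
x0 <- X[i, ]                              # predictors at that location
c0 <- C[, i]                              # covariances to all observations

# Naive two-step variance: OLS trend variance + residual kriging variance,
# added as if the two components were uncorrelated
fit       <- lm(z ~ x)
var_reg   <- sum(x0 * (vcov(fit) %*% x0)) # x0' Var(b_ols) x0 under OLS assumptions
var_res   <- covfun(0) - sum(c0 * solve(C, c0))
var_naive <- var_reg + var_res

# Universal kriging variance at the same location
Ci     <- solve(C)
d      <- x0 - t(X) %*% Ci %*% c0
var_uk <- covfun(0) - sum(c0 * (Ci %*% c0)) +
  sum(d * solve(t(X) %*% Ci %*% X, d))

print(c(var_reg = var_reg, var_res = var_res,
        var_naive = var_naive, var_uk = var_uk))
```

With no nugget, the residual kriging variance and the UK variance are both
zero (up to rounding) at an observation location, but the naive sum stays
positive because the trend variance is simply added on top.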
> You can at any time separate the predictions (e.g. krige only the
> residuals), but then gstat will not give you the regression-kriging
> variance, and you can not run geostatistical simulations.
>  
No, of course not, for the reasons mentioned above. The gstat approach
is: if you want to make a mess, please take responsibility for it
yourself (and don't blame me through the package). There is a paper in
which I did it with count data, though:

E.J. Pebesma, R.N.M. Duin, P.A. Burrough, 2005. Mapping Sea Bird
Densities over the North Sea: Spatially Aggregated Estimates and
Temporal Changes. Environmetrics 16 (6), p. 573-587.
http://dx.doi.org/10.1002/env.723

and (part of) the analysis is found in

library(gstat)
demo(fulmar)

I'm also confused by this term "regression kriging". Would you claim
that the universal kriging/kriging with (one or more) external drifts
implemented by gstat is not regression kriging? Are you actually working
on a package that does do regression kriging as you define it?
--
Edzer

> see also:
> https://stat.ethz.ch/pipermail/r-sig-geo/2008-February/003174.html
>
>
> All the best,
>
> Tom Hengl
> http://spatial-analyst.net
>
> Hengl, T., 2007. A Practical Guide to Geostatistical Mapping of
> Environmental Variables. EUR 22904 EN Scientific and Technical Research
> series, Office for Official Publications of the European Communities,
> Luxemburg, 143 pp.
> http://bookshop.europa.eu/uri?target=EUB:NOTICE:LBNA22904:EN:HTML
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] on behalf of Dave Depew
> Sent: Mon 6/16/2008 10:54 PM
> To: Paulo Justiniano Ribeiro Jr
> Cc: [email protected]
> Subject: Re: [R-sig-Geo] kriging
> 
> Ok,
> What about higher order polynomials? I have fitted one using a gam to
> the data, which helps to normalize the residuals and reduce the
> variance of the residuals.
> Is it simply a matter of plugging the function into the gstat command
> line? Or is it simpler to krige the residuals and then add the trend back
> to the interpolated residual grid?
>
>
> Paulo Justiniano Ribeiro Jr wrote:
>  
>> Dave,
>>
>> what is necessary for UK is a relation expressed by a linear model, not
>> necessarily a linear relation between the variables.
>> e.g. you could have a second degree polynomial and still work within the
>> scope of universal kriging.
>>
>>
>> On Mon, 16 Jun 2008, Dave Depew wrote:
>>
>>  
>>    
>>> Hi all,
>>> I have a data set that I would like to krige to interpolate between
>>> transects. There is a non-linear trend between two of the variables...my
>>> impression from reading the gstat help file is that there must be a
>>> linear relationship between the data to use universal kriging?
>>> Second, would a method of non-linear regression followed by modelling of
>>> the residuals with a semivariogram be an appropriate solution?
>>>
>>> Thanks,
>>>
>>> Dave
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> [email protected]
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>>    
>>>      
>> Paulo Justiniano Ribeiro Jr
>> LEG (Laboratorio de Estatistica e Geoinformacao)
>> Universidade Federal do Parana
>> Caixa Postal 19.081
>> CEP 81.531-990
>> Curitiba, PR  -  Brasil
>> Tel: (+55) 41 3361 3573
>> Fax: (+55) 41 3361 3141
>> e-mail: paulojus AT  ufpr  br
>> http://www.leg.ufpr.br/~paulojus
>>
>>
>>
>>
>>    
>

--
Edzer Pebesma
Institute for Geoinformatics (IfGI)
University of Münster
http://ifgi.uni-muenster.de/
