Gerald van den Boogaart wrote:
...
Then one could either rely on Bayesian kriging, using the inferred possible
range of parameters and the typical variogram shape as a prior, or check the
reliability of the various automated fitting procedures (which hopefully the
others on the list will supply) against those old datasets.
Thanks for the invitation Gerald ;-)
The automatic fitting routine I use most fits a variogram model to sample
(experimental) variogram values using weighted least squares. Let the
sample variogram consist of a series of g*(h) values for (averaged) lag
values h; the variogram model is g(h), and the number of point pairs used
to estimate g*(h) is N(h). The weights typically depend on g(h) or g*(h)
and/or N(h). The question now is how to weight. Here are some
experiences/opinions.
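To make the setup concrete, here is a minimal sketch of such a weighted
fit in Python with NumPy/SciPy (not gstat's actual code; the spherical
model, the lag values and the pair counts below are hypothetical
illustration data):

```python
import numpy as np
from scipy.optimize import least_squares

def spherical(h, nugget, psill, rng):
    """Spherical variogram model g(h); flat at nugget + psill for h >= rng."""
    s = np.minimum(np.asarray(h, float) / rng, 1.0)
    return nugget + psill * (1.5 * s - 0.5 * s**3)

# Hypothetical sample variogram: lag centers h, estimates g*(h), pair counts N(h)
h      = np.array([50., 100., 200., 400., 800.])
g_star = np.array([0.4, 0.7, 1.0, 1.15, 1.2])
N      = np.array([30, 120, 400, 900, 1500])

def residuals(p):
    nugget, psill, rng = p
    # weights N(h): scale each residual by sqrt(N(h)), so the squared
    # residuals in the objective are weighted by N(h)
    return np.sqrt(N) * (g_star - spherical(h, nugget, psill, rng))

fit = least_squares(residuals, x0=[0.1, 1.0, 300.0],
                    bounds=([0., 0., 1e-6], [np.inf, np.inf, np.inf]))
nugget, psill, rng = fit.x
```

Swapping in a different weighting scheme only changes the `np.sqrt(N)`
factor in `residuals`.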
When you use weights N(h), you tend to give too much weight to fitting
g(h) at larger h, where you typically have many more point pairs than at
short distances. For that reason this often leads to underestimation of
spatial correlation (too large nuggets).
When you use weights N(h)/(g(h)^2) (I believe this is the original
Cressie** proposal), the problem is that the weights change from
iteration to iteration. This may lead to instability, to oscillation, or
to "best" fits with a larger weighted RSS than in the previous
iteration. Which fit should you then choose? If instead you fix the
weights, the final fit depends on the initial g(h) values chosen.
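The iteration issue can be sketched as a fixed-point loop: refit with
weights frozen at the current model, then recompute the weights, and
repeat until (hopefully) nothing changes. A sketch under the same
hypothetical data and spherical model as above:

```python
import numpy as np
from scipy.optimize import least_squares

def spherical(h, nugget, psill, rng):
    s = np.minimum(np.asarray(h, float) / rng, 1.0)
    return nugget + psill * (1.5 * s - 0.5 * s**3)

# Hypothetical sample variogram (same illustration data as before)
h      = np.array([50., 100., 200., 400., 800.])
g_star = np.array([0.4, 0.7, 1.0, 1.15, 1.2])
N      = np.array([30, 120, 400, 900, 1500])

p = np.array([0.1, 1.0, 300.0])              # initial (nugget, psill, range)
for _ in range(25):
    g_model = spherical(h, *p)               # weights frozen at current model:
    w = np.sqrt(N) / np.maximum(g_model, 1e-12)   # sqrt of N(h)/g(h)^2
    res = least_squares(lambda q: w * (g_star - spherical(h, *q)), p,
                        bounds=([0., 0., 1e-6], [np.inf] * 3))
    if np.allclose(res.x, p, rtol=1e-6):
        break                                # fixed point reached
    p = res.x                                # else: new model, new weights
```

There is no general guarantee the loop converges rather than oscillates,
which is exactly the instability described above.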
If you use N(h)/(g*(h)^2), weights may become infinite when g*(h) is
zero (think of indicator variables).
If you use N(h)/(h^2) as weights, you still acknowledge that smaller
distances get larger weight and that larger N(h) results in more weight,
but you avoid some of the problems mentioned above. It is equivalent to
the Cressie** WLS fit when using a linear variogram without nugget (and
infinite range). This is the default that the fit.variogram function in
R package gstat uses (this is not meant as advertising, only as
information). The original idea was, I believe, published by Zhang et
al. in a Computers & Geosciences paper***. A disadvantage: a g*(h) with
h very close to or equal to zero (near or exact duplicate observation
locations) receives (almost) infinite weight.
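In the sketch from before, this scheme is a one-line change: the weights
depend only on h and N(h), so they are fixed up front and the iteration
problem disappears (the near-zero-h blow-up, however, is visible
directly in the formula):

```python
import numpy as np
from scipy.optimize import least_squares

def spherical(h, nugget, psill, rng):
    s = np.minimum(np.asarray(h, float) / rng, 1.0)
    return nugget + psill * (1.5 * s - 0.5 * s**3)

# Hypothetical sample variogram (same illustration data as before)
h      = np.array([50., 100., 200., 400., 800.])
g_star = np.array([0.4, 0.7, 1.0, 1.15, 1.2])
N      = np.array([30, 120, 400, 900, 1500])

# sqrt of N(h)/h^2; fixed before fitting, but blows up as h -> 0
# (near or exact duplicate observation locations)
w = np.sqrt(N) / h

fit = least_squares(lambda q: w * (g_star - spherical(h, *q)),
                    x0=[0.1, 1.0, 300.0],
                    bounds=([0., 0., 1e-6], [np.inf, np.inf, np.inf]))
```

This mirrors the weighting idea behind gstat's default, not its
implementation.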
All of this of course ignores the correlations between g*(h1) and g*(h2)
for h2 > h1, which are present because the point pairs formed may share
observations. Addressing that would turn the fit into generalized least
squares. I believe that Ronald Christensen (or was it Peter Kitanidis?)
once showed, in some Math Geol paper in the 90s, that fitting a model to
the variogram cloud (with all N(h)=1) using all correlations between
g*(h1) and g*(h2) is equivalent to the (restricted) maximum likelihood
fit. I forgot which paper.
A long story to admit that I do not have a simple answer. What did I
forget?
Does anyone know which defaults are chosen in other software that
"helps" us fit variograms by WLS?
--
Edzer
** N. Cressie, Fitting variogram models by weighted least squares.
Mathematical Geology, 17(5):563-586, 1985.
*** X. F. Zhang, J. C. H. van Eijkeren and A. W. Heemink, On the
weighted least-squares method for fitting a semivariogram model.
Computers & Geosciences, 21(4):605-608, 1995.