RE: AI-GEOSTATS: moving averages and trend

2010-02-02 Thread Cornford, Dan
Sebastiano,

  I am struggling to understand why you are interested in doing trend + 
residual separation. There can be no unique decomposition of a data set into 
'trend' and 'residual'; it is a judgement about what model you feel is most 
appropriate given your prior beliefs and observations (evidence). The only 
thing you can do with the model is validate it on out-of-sample data (even 
as a Bayesian I say this!). So in a sense there is no correct decomposition, 
and any decomposition is valid (so long as it is correctly implemented - maybe 
that is your question?). Are some decompositions better than others? Well, yes, 
they are likely to be, but this largely depends on your data (and the 
completeness of the overall model).

In terms of your original question about the shape of the kernel, there is no 
overall theory that I am aware of - different kernels will have different 
properties in terms of the function classes that they represent (e.g. 
differentiability, frequency response / characteristic length scales). Kernel 
families will also have different null spaces, which might or might not be important 
for your specific application and what you want to find out.
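
As a minimal illustration of how the kernel choice changes smoothness (a sketch 
in R with hypothetical parameters, not tied to any particular package):

# Two common covariance (kernel) functions for a unit-variance process.
# The squared-exponential gives very smooth (infinitely differentiable)
# sample paths; the exponential gives continuous but non-differentiable ones.
sq_exp_cov <- function(h, ell = 1) exp(-(h / ell)^2)
exp_cov    <- function(h, ell = 1) exp(-abs(h) / ell)

h <- seq(0, 3, by = 0.05)
plot(h, sq_exp_cov(h), type = "l", xlab = "lag h", ylab = "correlation")
lines(h, exp_cov(h), lty = 2)
legend("topright", c("squared-exponential", "exponential"), lty = 1:2)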

I'm not sure if this is terribly helpful ... but I think it is the reality - 
everything depends on your data and your judgement (prior). Conditional on 
those you get a model and you need to validate this model carefully ... then 
you are OK.

cheers

Dan
---
Dr Dan Cornford
Senior Lecturer, Computer Science and NCRG
Aston University, Birmingham B4 7ET

www: http://wiki.aston.ac.uk/DanCornford/

tel: +44 (0)121 204 3451
mob: 07766344953
---

From: owner-ai-geost...@jrc.ec.europa.eu 
[mailto:owner-ai-geost...@jrc.ec.europa.eu] On Behalf Of seba
Sent: 02 February 2010 08:39
To: Pierre Goovaerts
Cc: ai-geostats@jrc.it
Subject: Re: AI-GEOSTATS: moving averages and trend

Hi Pierre

I think that for my task factorial kriging is a little too sophisticated 
(nevertheless, is there any open source or free implementation of it? 
I remember that it is implemented in Isatis).

I have an exhaustive and regularly spaced data set (i.e. a grid), and I need
to calculate locally the spatial variability of the residual surface - or, better,
the spatial variability of the high-frequency component.
Here I'm lucky because I know exactly what I want to see and what I need to 
filter out.
In theory, using (overlapping) moving window averages (although here it seems better 
to use some more complex kernel), one should be able to filter out the short-range 
variability (characterized by a possible variogram range within the window size?).
Seeing the problem from another perspective, in the case of a perfect
sine wave behaviour, I should be able to filter out spatial
variability components with wavelengths up to the window size.
But maybe there is something flawed in my reasoning, so feedback is 
appreciated!
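
For illustration, a minimal sketch in R of this moving-window idea on a regular 
grid (the window size and the synthetic data are hypothetical, just to show the 
trend/residual split):

# z stands in for the regularly gridded variable (here just synthetic data)
set.seed(1)
z <- matrix(rnorm(100 * 100), 100, 100)

# Moving-window (boxcar) average; w is the window width in cells (odd number).
moving_average_2d <- function(z, w = 5) {
  half <- (w - 1) %/% 2
  nr <- nrow(z); nc <- ncol(z)
  out <- matrix(NA_real_, nr, nc)
  for (i in 1:nr) {
    for (j in 1:nc) {
      rows <- max(1, i - half):min(nr, i + half)
      cols <- max(1, j - half):min(nc, j + half)
      out[i, j] <- mean(z[rows, cols], na.rm = TRUE)
    }
  }
  out
}

trend    <- moving_average_2d(z, w = 5)  # low-frequency ('trend') component
residual <- z - trend                    # high-frequency component
# a variogram of 'residual' then describes variability at scales shorter than
# roughly the window width; a smoother kernel (e.g. Gaussian weights) would
# give a cleaner frequency cut-off than this boxcar window.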
Bye
Sebastiano




At 16.27 01/02/2010, you wrote:

Well, Factorial Kriging Analysis allows you to tailor the filtering weights
to the spatial patterns in your data. You can use the same filter size but
different kriging weights depending on whether you want to estimate
the local or regional scales of variability.

Pierre

2010/2/1 seba <sebastiano.trevis...@libero.it>
Hi José
Thank you for the interesting references. I'm going to take a look!
Bye
Sebastiano


At 15.46 01/02/2010, José M. Blanco Moreno wrote:

Hello again,
I am not a mathematician, so I never worried too much about the theoretical 
reasons. You may be able to find some discussion on this subject in Eubank, 
R.L. 1999. Nonparametric Regression and Spline Smoothing, 2nd ed. M. Dekker, New 
York.
You may also be interested in searching for information in, and in work related to 
(perhaps citing), this paper: Altman, N. 1990. Kernel smoothing of data with correlated 
errors. Journal of the American Statistical Association, 85: 749-759.
seba wrote:

Hi José
Thank you for your reply.
Indeed, I'm trying to figure out the theoretical reasons for their use.
Bye
Sebas




--
Pierre Goovaerts

Chief Scientist at BioMedware Inc.
3526 W Liberty, Suite 100
Ann Arbor, MI  48103
Voice: (734) 913-1098 (ext. 202)
Fax: (734) 913-2201

Courtesy Associate Professor, University of Florida
Associate Editor, Mathematical Geosciences
Geostatistician, Computer Sciences Corporation
President, PGeostat LLC
710 Ridgemont Lane
Ann Arbor, MI 48103
Voice: (734) 668-9900
Fax: (734) 668-7788

http://goovaerts.pierre.googlepages.com/


RE: AI-GEOSTATS: Interpolation of measures with measurement errors

2009-10-06 Thread Cornford, Dan
Enrico,

  sorry we have caused some problems / confusion. I will reply inline below:

From: Enrico Guastaldi [mailto:enrico.guasta...@gmail.com] 
Sent: 05 October 2009 17:04
To: Cornford, Dan
Cc: ai-geostats@jrc.it; r-sig-...@stat.math.ethz.ch
Subject: Re: AI-GEOSTATS: Interpolation of measures with measurement errors

Dear Dan and dear lists members,
I will try to explain my problem in two steps: the first is the theory, the 
second the practical application to my case study.

1)
Actually my errors are not exactly Gaussian; however, I think I can treat them 
as approximately Gaussian.
It seems I have to set the diagonal values of the covariance matrix (used 
for the variogram) to non-zero values. Maybe I have to put the measurement errors 
into this matrix instead of zeros. However, these are not actual variances but 
ranges derived from the instrumental error. Is it possible to treat them as 
confidence intervals and use them as the diagonal of the covariance matrix?
That is the theory.

DC: I think this highlights that when people specify errors they should be as 
precise as possible (this is what UncertML is designed to try and help people 
do). Giving a range is really only useful with a precise definition of what the 
min and max values mean (5th and 95th percentiles?). In a Bayesian approach to 
the problem, which I prefer, one is required to specify a probability 
distribution (two percentiles alone are not enough). This can be quite a 
challenge, but the other imprecise probability models, while often rather 
attractive on the surface, can lack the ability to undertake complex analysis 
and generally can only make weaker statements (note I don't want to start a 
debate here about Bayesian versus other subjective / imprecise uncertainty 
frameworks!).

DC: Also, in theory, yes: I might want to put observation errors on the diagonal 
of the covariance matrix, but I might also want to allow an additional nugget 
effect to model unresolved variation (i.e. things happening below the 
measurement separation distance). I would want to estimate the nugget effect 
but fix the known observation errors; we do this in the psgp code.
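
To make the algebra concrete, here is a minimal sketch (plain R, not the psgp 
implementation itself; the exponential covariance, constant mean and parameter 
values are just illustrative assumptions) of prediction with known 
per-observation error variances added to the diagonal alongside a nugget:

# Simple-kriging / GP prediction where the observation covariance is
# K + diag(nugget + err_var): 'err_var' holds the known measurement-error
# variances, and the nugget models additional unresolved variation.
gp_predict <- function(X, y, err_var, Xnew, sill = 1, range = 1, nugget = 0) {
  cov_fun <- function(d) sill * exp(-d / range)        # exponential covariance
  n <- nrow(X); m <- nrow(Xnew)
  K <- cov_fun(as.matrix(dist(X)))                     # data-data covariances
  diag(K) <- diag(K) + nugget + err_var                # nugget + known errors
  d_cross <- as.matrix(dist(rbind(Xnew, X)))[1:m, (m + 1):(m + n), drop = FALSE]
  k <- cov_fun(d_cross)                                # prediction-data covariances
  mu  <- mean(y) + k %*% solve(K, y - mean(y))         # predictive mean
  var <- sill - rowSums(k * t(solve(K, t(k))))         # variance of the noise-free surface
  list(mean = as.vector(mu), var = pmax(var, 0))
}

# hypothetical usage, with err_var as variances (squared standard deviations):
# pred <- gp_predict(X = as.matrix(obs[, c("x", "y")]), y = obs$value,
#                    err_var = obs$stddev^2, Xnew = as.matrix(grid_xy))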

DC: On the issue of Gaussian errors, well I think you are going to need to make 
some distributional assumption here and the Gaussian one will make life easier. 
My experience is that if your underlying distribution is symmetric then your 
mean predictions will not be overly sensitive to the exact distributional form 
(i.e. small deviations from Gaussian), but if the distribution is skewed 
significantly then this might be more of an issue. This is in practice I must 
say. I would also emphasise here we are discussing the errors on the 
observations, not the observations themselves!

But in practice?

2)
In practice, since I'm not a programmer, I have tried to use the psgp R package 
to perform the interpolation of my variable with the associated errors.
I tried to use the Intamap online service, but I had some problems (maybe with 
network traffic) and did not get a result.
So I installed psgp and intamap on my computer, in order to perform the 
calculation locally on my machine.

The first ten rows of my dataset are as follows (variable names: X, Y, V2_PPM, 
ERR_V2_PPM):

715946.900,4826440.340,2.280,0.140
722818.590,4824910.500,2.820,0.140
725514.920,4815239.460,2.380,0.130
722793.930,4810022.240,3.160,0.150
717682.540,4811456.540,3.040,0.140
712376.620,4806677.870,2.730,0.150
716270.140,4801958.660,2.650,0.140
721068.720,4801447.860,2.990,0.150
718812.980,4792920.780,4.450,0.170
722315.960,4788258.960,2.190,0.130
...

Actually I do not understand how to process these data to interpolate my variable 
V2_PPM.
The EPSG code is 23032.
I think I have to set up my Intamap object as follows (is it correct?):

library(sp)        # coordinates(), gridded(), proj4string(), CRS()
library(intamap)   # createIntamapObject()
library(psgp)      # the psgp interpolation method

# 'rock' holds the observations (X, Y, V2_PPM, ERR_V2_PPM) read in as a data
# frame; 'grid.enrico' holds the prediction locations (columns x, y)
coordinates(rock) = ~X+Y
gridded(grid.enrico) = ~x+y
proj4string(rock) = CRS("+init=epsg:23032")
proj4string(grid.enrico) = CRS("+init=epsg:23032")
# set up intamap object:
obj = createIntamapObject(
    observations = rock,
    predictionLocations = grid.enrico,
    targetCRS = "+init=epsg:23032",
    class = "psgp"
)

However, I did not understand where I can declare the variable containing the 
measurement errors.

DC: I am not a great R user, I am afraid (I probably should not be sending this 
to the R sig list!), but Jon / Remi might be able to provide more advice on how 
to use psgp from R in practice. I know the WPS rather better!
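
As a hedged sketch of where the errors might go (my assumption, to be checked 
against the intamap/psgp documentation, is that the target variable goes in a 
column named value, that observation-error variances go in a column named oevar, 
and that ERR_V2_PPM is a standard deviation, which as discussed above needs 
checking):

# Assumption: intamap expects the target in a column 'value', and psgp picks
# up known observation-error variances from a column 'oevar' (check the docs).
rock$value = rock$V2_PPM
rock$oevar = rock$ERR_V2_PPM^2       # variances, i.e. squared standard deviations
obj = createIntamapObject(
    observations = rock,
    predictionLocations = grid.enrico,
    targetCRS = "+init=epsg:23032",
    class = "psgp"
)
obj = preProcess(obj)
obj = estimateParameters(obj)        # fits the psgp covariance parameters
obj = spatialPredict(obj)            # predictions should appear in obj$predictions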

DC: How big is your data set? If you send it to me, I'd be very interested to 
try with the web service since this is part of our project outcomes!!

Moreover, I do not understand how to proceed with the rest of the process (i.e. 
experimental variogram, variogram modelling, and finally kriging).

DC: I can say more about how the psgp method works. This is a maximum 
(marginal) likelihood inference-based method, so there is no experimental 
variogram modelling; rather, a fixed covariance function (in the present
RE: AI-GEOSTATS: Interpolation of measures with measurement errors

2009-09-28 Thread Cornford, Dan
Enrico,

  we have built an online system to perform this as part of the INTAMAP project 
- you can try this here:

http://intamap.geo.uu.nl/~jon/intamap/tryIntamapj.php

If you paste in observations with Gaussian errors (I assume the +/- means one 
or two standard deviations - I would check this!) in the form x, y, value, 
stddev then our interpolation method (called psgp, which will shortly be 
released as an R and C++ library too) will provide a prediction of the mean and 
variance using a maximum likelihood Gaussian process method.
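
For example, the pasted input might look like this (hypothetical values, in the 
x, y, value, stddev form described above):

715946.9, 4826440.3, 2.28, 0.14
722818.6, 4824910.5, 2.82, 0.14
725514.9, 4815239.5, 2.38, 0.13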

The interface on that web page should allow you to try out the system very 
simply and the associated web site has details for more interactive ways of 
using the service, or installing the system on your own machine.

If you want to have a quick look I suggest using the OMI NO2 data set which 
contains error estimates (this could take a little bit of time depending on the 
usage of the service!). Note the visualisation is still a little beta, so I 
would not entirely trust the legends!

Further details will be added to the web site in the next few weeks!

cheers

Dan

---
Dr Dan Cornford
Senior Lecturer, Computer Science and NCRG
Aston University, Birmingham B4 7ET

www: http://wiki.aston.ac.uk/DanCornford/

tel: +44 (0)121 204 3451
mob: 07766344953
---

From: owner-ai-geost...@jrc.it [mailto:owner-ai-geost...@jrc.it] On Behalf Of 
Enrico Guastaldi
Sent: 28 September 2009 13:55
To: ai-geostats@jrc.it
Subject: AI-GEOSTATS: Interpolation of measures with measurement errors

Dear list members,
I'm looking for some kind of interpolation for values of an environmental 
variable that has been measured together with the measurement errors; for instance, 
one measurement is 45 ppm +or- 10.7 ppm, another is 10 ppm +or- 3 ppm, and so on. In 
practice, measurements and measurement errors are two independent variables.
I could use some kind of kriging; however, I know exactly the magnitude of each 
error at every sampled location, i.e. the value plus or minus the error given to me 
by the laboratory.
Could anyone tell me what kind of function I should use to handle this 
problem?
An R package would be nice, of course, but I need to understand the 
background theory.
Thanks in advance,
Regards,

Enrico Guastaldi