Le Tue, 29 May 2018 15:19:39 +0200, Markus Metz <[email protected]> a écrit :
> On Tue, May 29, 2018 at 2:21 PM, Moritz Lennert < > [email protected]> wrote: > > > > Le Tue, 29 May 2018 08:59:47 +0200, > > Markus Metz <[email protected]> a écrit : > > > > > On Mon, May 28, 2018 at 6:16 PM, Moritz Lennert < > > > [email protected]> wrote: > > > > > > > > Hi, > > > > > > > > AFAIU kernel density calculations, one takes a number of points > > > > and > > > redistributes this total number across the entire region using a > > > specified kernel function as estimator as to the spatial pattern > > > of this redistribution. The total sum should correspond to the > > > total number of points in the input. Is this understanding > > > correct ? > > > > > > > > In v.kernel, this seems to be dependent on the resolution: > > > > > > > > echo "4.5,4.5" | v.in.ascii in=- sep=comma out=testpoint > > > > > > > > g.region n=9 s=0 w=0 e=9 res=1 > > > > v.kernel in=testpoint out=testrast radius=5 kernel=gaussian --o > > > > r.univar testrast > > > > [...] > > > > sum: 0.999544365566944 > > > > > > > > > > > > but > > > > g.region n=9 s=0 w=0 e=9 res=2 > > > > v.kernel in=testpoint out=testrast radius=5 kernel=gaussian --o > > > > r.univar testrastr s > > > > [...] > > > > sum: 0.308567902849234 > > > > > > > > IMHO, the sum should always be close to 1, or ? > > > > > > I think not, because the Gaussian kernel is a general Gaussian > > > function with user-defined sigma = dmax / 4 [0]. The sum would be > > > close to 1 only for a normal function (special case of the > > > Gaussian function) with sigma determined from the observed > > > distances. For the Gaussian kernel, the sum of the output raster > > > should increase with higher resolution and constant sigma. > > > > > > > Thanks. I'll have to think about this a bit more when I have the > > time to fully understand. I imagine there should be a way to > > calculate the multiplier necessary to reach the correct sum > > I don't think there is something like "the correct sum". You are right. Correct is not the right term. However: > > v.kernel provides an estimate for each output cell, how evenly the > input points are distributed around this cell. The definition of > "evenly" depends on the kernel function and bandwidth. I do not have the exact same understanding of kernel estimation: For me it distributes an existing population (defined in GRASS as one per point - others provide option to use a population attribute) across a space. How it is distributed depends on the kernel function and bandwidth. The idea being that for a series of observations in space the exact location of that observation is somewhat random and that you can thus distribute the actual observations across space. If you only have a sample of observations, then you can use a multiplier to estimate total observations as long as your sample is well distributed. But if you have the total population, then you distribute this population in space and the sum across this space should correspond to your population. Here's how they explain it in the ArcGIS manual (hardly a reference I know, but still): "Conceptually, a smoothly curved surface is fitted over each point. The surface value is highest at the location of the point and diminishes with increasing distance from the point, reaching zero at the Search radius distance from the point. Only a circular neighborhood is possible. The volume under the surface equals the Population field value for the point, or 1 if NONE is specified. The density at each output raster cell is calculated by adding the values of all the kernel surfaces where they overlay the raster cell center."[1] Often texts start by saying that the simplest form of the density estimation is the histogram. Kernel functions then allow to make this smoother, but IMU the surface beneath the curve should stay the same. See also the answer by whuber at [2] using the sand analogy. And the reference mentioned in the next post [3], where it is said "In Figure 4‑43, for each point (7,8,9,12 and 14) we have provided a Normal distribution curve with central value (the mean) at the point in question and with an average spread (standard deviation) of 2 units. We can then add the areas under each of these curves together to obtain a (cumulative) curve with two peaks, and then divide this curve by 5 to adjust the area under the curve back to 1 giving the lower red curve shown. When adjusted in this way the values are often described as probability densities, and when extended to two dimensions, the resulting surface is described as a probability density surface, rather than a density surface. We now have a density value for every position along the original line, with smooth transitions between the values, which is exactly what we were trying to achieve." I guess the question is whether we are speaking about normalized or not normalized density ? Moritz [1] https://pro.arcgis.com/en/pro-app/tool-reference/spatial-analyst/how-kernel-density-works.htm#GUID-3BCBF5CA-CAC7-4547-A3CF-B5E30FDE584E [2] https://gis.stackexchange.com/questions/1553/how-to-interpret-grass-v-kernel-results [3] http://www.spatialanalysisonline.com/HTML/?density__kernels_and_occupancy.htm _______________________________________________ grass-dev mailing list [email protected] https://lists.osgeo.org/mailman/listinfo/grass-dev
