Re: [GRASS-dev] v.univar question: Why not lines and areas?

Moritz Lennert Tue, 29 Jan 2008 22:30:43 -0800

On 30/01/08 02:43, Michael Barton wrote:

On Jan 29, 2008, at 5:12 PM, Moritz Lennert wrote:
On 28/01/08 16:22, Michael Barton wrote:
On Jan 28, 2008, at 5:50 AM, Moritz Lennert wrote:
On 27/01/08 20:30, Michael Barton wrote:
v.univar only works with points. But since it is calculating
stats on a field in the attributes table, it should work the same
for all vector objects. Can we get rid of the limitation that it
only works with points?
There was some debate [1] about the statistical validity of working
with the other types, as the way it was programmed, the statistics
were calculated with weights which corresponded to line length /
area surface.
I guess we might want to distinguish between a v.univar which works
on the actual vector objects from a v.db.univar which works on any
arbitrary attribute (or combination of attributes). We could write
a C-replacement of the current v.db.univar script on the base of
the code I have for the classification algorithms used in v.class.
AFAICT, v.univar does not calculate anything from vector topology,
only from an attribute column.
[...]
An attribute is the same whether it's linked to a point, line, or
area.
v.univar currently calculates as follows for lines and areas, even
though the results are never printed (main.c):
[lines:]
    l = Vect_line_length ( Points );
    sum += l*val;
    sumsq += l*val*val;
    sum_abs += l * fabs (val);
    total_size += l;

[areas:]
    a = Vect_get_area_area ( &Map, area );
    sum += a*val;
    sumsq += a*val*val;
    sum_abs += a * fabs (val);
    total_size += a;

    if ( (otype & GV_LINES) || (otype & GV_AREA) ) {
        mean = sum / total_size;
        mean_abs = sum_abs / total_size;

So the mean is actually a weighted mean with the line length or area as
weight. I don't really know why Radim coded it like this at the time,
and I think we should change this so that it just uses unweighted
feature counts, just as Roger suggested at the time. Try the attached
(untested) patch.
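To make the difference concrete, here is a minimal sketch in plain C.
It is not the v.univar code: the function names mean_unweighted and
mean_weighted are made up for illustration, and the sizes stand in for
whatever Vect_line_length or Vect_get_area_area would return.

```c
#include <stddef.h>

/* Unweighted mean: every feature counts exactly once. */
double mean_unweighted(const double *val, size_t n)
{
    double sum = 0.0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += val[i];

    return n > 0 ? sum / (double)n : 0.0;
}

/* Size-weighted mean: each value is weighted by a line length or an
 * area, following the sum / total_size accumulation quoted above.
 * (Hypothetical helper, not part of GRASS.) */
double mean_weighted(const double *val, const double *size, size_t n)
{
    double sum = 0.0, total_size = 0.0;
    size_t i;

    for (i = 0; i < n; i++) {
        sum += size[i] * val[i];
        total_size += size[i];
    }

    return total_size > 0.0 ? sum / total_size : 0.0;
}
```

With values {10, 20} and sizes {1, 3}, the unweighted mean is 15 while
the weighted mean is 17.5, so the two can differ noticeably as soon as
feature sizes are unequal.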
One thing that does potentially matter, though, is whether to use the
features or the attribute columns as a base. If you have several
features with the same cat value, this can make a difference, as in
the former case they will all be counted individually, whereas in the
latter case, they will only be counted once. If each of the features
has an individual meaning, then the former case seems more correct,
but if not (e.g. each island of the Philippines counted separately in
a table which lists population by country), the latter seems more
appropriate. Obviously we could just say that it is up to the user to
make sure that the map data is correct, i.e. if we take the above
example, there should only be one centroid linked to data per country.
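As a rough illustration of the two counting modes, the sketch below
reduces a per-feature list to one value per distinct cat. The function
name values_by_cat and the parallel-array layout are assumptions made
for this example; nothing here comes from the GRASS libraries.

```c
#include <stddef.h>

/* Copy one value per distinct category into out, keeping the value of
 * the first feature seen for each category. out must have room for n
 * values. Returns the number of values written.
 * (Hypothetical helper; quadratic, which is fine for a sketch.) */
size_t values_by_cat(const int *cat, const double *val, size_t n,
                     double *out)
{
    size_t i, j, count = 0;

    for (i = 0; i < n; i++) {
        int seen = 0;

        /* Has an earlier feature already used this category? */
        for (j = 0; j < i; j++) {
            if (cat[j] == cat[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            out[count++] = val[i];
    }

    return count;
}
```

Take three features sharing cat 1 with value 90 and one feature with
cat 2 and value 10. Counting by feature uses all four values and gives
a mean of 70; counting by cat keeps two values and gives a mean of 50.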
The way the routines are written in v.class, they take an arbitrary
array of floats, so it is up to the individual modules to decide how
to create this array.
This is all very interesting. It is a bit worrisome too. I don't want a
mean of an attribute column weighted by area unless I specifically ask
for it. This suggests that people using v.univar may not be getting what
they think they are getting. I think it is an excellent option, but
should not be a silent default.

Well, since the results are not printed, the problem doesn't really
exist. The patch I sent doesn't weight at all, just counts features.

How to count the features is a bit of an issue, but couldn't this be
left up to the user too--summarize by cat or by individual feature as an
option?

That's why I think we should have a library function which calculates
stats (i.e. extend what is in the v.class code), and let the modules
deal with such issues.
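Such a function could look roughly like the sketch below. This is only
an outline under my own assumptions: the struct univar_stats and the
function compute_univar are hypothetical names, not the v.class code
or any existing GRASS API. The point is that the function only sees an
array, so each module decides beforehand whether that array holds one
value per feature or one per cat.

```c
#include <stddef.h>

/* Hypothetical result structure for basic univariate statistics. */
struct univar_stats {
    size_t n;
    double min, max, sum;
    double mean, mean_abs;
    double pop_variance;
};

/* Compute unweighted statistics over an arbitrary array of values.
 * Returns 0 on success, -1 if the array is empty. */
int compute_univar(const double *data, size_t n, struct univar_stats *st)
{
    double sum = 0.0, sum_abs = 0.0, sumsq = 0.0;
    size_t i;

    if (n == 0)
        return -1;

    st->min = st->max = data[0];
    for (i = 0; i < n; i++) {
        double v = data[i];

        sum += v;
        sum_abs += v < 0.0 ? -v : v;
        sumsq += v * v;
        if (v < st->min)
            st->min = v;
        if (v > st->max)
            st->max = v;
    }

    st->n = n;
    st->sum = sum;
    st->mean = sum / (double)n;
    st->mean_abs = sum_abs / (double)n;
    /* Population variance from the running sums. */
    st->pop_variance = (sumsq - sum * sum / (double)n) / (double)n;

    return 0;
}
```

For the values {-2, 0, 2, 4} this gives a mean of 1, a mean of
absolute values of 2 and a population variance of 5.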


Moritz