Hi,
Markus Neteler wrote:
On Sat, Jun 02, 2007 at 01:00:12PM +0100, Glynn Clements wrote:
Markus Neteler wrote:
would it be much work to fix this:
GRASS 6.3.cvs (nc_spm_05):~ > r.statistics base=landuse96_28m \
cover=elevation out=elevstats_avg method=average
ERROR: This module currently only works for integer (CELL) maps
Rounding elevation to CELL first isn't a great option.
1. r.statistics works by reclassing the base map, so the base map
can't be FP.
In this case I meant the cover map "elevation", which is rejected:
landuse96_28m is a CELL map, elevation is FCELL.
2. r.statistics uses r.stats to calculate the statistics, and r.stats
reads its inputs as CELL.
Could r.statistics also accept an FCELL cover map in this case,
without complaining? Currently I need an extra step to round
elevation to a CELL map before running r.statistics.
r.stats is inherently based upon discrete categories. Even if it reads
FP maps as FP, you would need to quantise the values, otherwise you
would end up with every cell as its own category with count == 1. This
would require memory proportional to the size of the input map
multiplied by a significant factor (48-64 bytes per cell, or even
more).
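The quantisation step described above can be sketched as follows. This is a minimal Python illustration (not GRASS C code); the function name and interface are hypothetical. It bins FP values into a fixed number of equal-width sub-ranges so the category count, and hence memory, stays bounded instead of growing with the number of distinct FP values:

```python
from collections import Counter

def quantize_counts(values, vmin, vmax, nsteps):
    """Bin floating-point values into nsteps equal-width sub-ranges
    and count cells per bin; memory is O(nsteps) rather than
    proportional to the number of distinct FP values."""
    width = (vmax - vmin) / nsteps
    counts = Counter()
    for v in values:
        # clamp the top edge so v == vmax falls into the last bin
        i = min(int((v - vmin) / width), nsteps - 1)
        counts[i] += 1
    return {(vmin + i * width, vmin + (i + 1) * width): n
            for i, n in sorted(counts.items())}
```

Without such binning, a per-category hash or tree over raw FP values degenerates into one entry per cell, which is exactly the memory blow-up described above.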
To handle FP data, you really need a completely new approach which
computes aggregates incrementally, using an accumulator. This would
limit it to aggregates which can be computed that way, e.g. count,
sum, mean, variance and standard deviation.
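As a sketch of that accumulator idea (plain Python, hypothetical class name, not an actual GRASS module): one such accumulator per base category updates count, sum, mean and variance incrementally in O(1) memory, here using Welford's update for the variance:

```python
class RunningStats:
    """Incremental (streaming) aggregates: count, sum, mean,
    variance via Welford's method; O(1) memory per category."""
    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, x):
        self.n += 1
        self.total += x
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # population variance; undefined for an empty accumulator
        return self.m2 / self.n if self.n > 0 else float('nan')
```

Welford's update avoids the cancellation problem of the naive one-pass formula while still needing only a single pass over the data.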
I vaguely remember that Soeren already has something in the works.
I have implemented r3.stats from scratch to handle FP ranges:
GRASS 6.3.cvs > r3.stats help
Description:
Generates volume statistics for raster3d maps.
Keywords:
raster3d, statistics
Usage:
r3.stats [-e] input=name [nsteps=value] [--verbose] [--quiet]
Flags:
-e Calculate statistics based on equal value groups
--v Verbose module output
--q Quiet module output
Parameters:
input Name of input raster3d map
nsteps Number of sub-ranges to collect stats from
default: 20
Example:
#region
north: 1000
south: 0
west: 0
east: 2000
top: 10.00000000
bottom: 0.00000000
nsres: 50
nsres3: 50
ewres: 50
ewres3: 50
tbres: 1
rows: 20
rows3: 20
cols: 40
cols3: 40
depths: 10
cells: 800
3dcells: 8000
# create a 3d map
r3.mapcalc "map3d = depth()"
# automatically calculated value range sub-groups
GRASS 6.3.cvs > r3.stats map3d nsteps=5
100%
num | minimum <= value | value < maximum |      volume |     perc | cell count
  1        1.000000000       2.800000000   4000000.000   20.00000        1600
  2        2.800000000       4.600000000   4000000.000   20.00000        1600
  3        4.600000000       6.400000000   4000000.000   20.00000        1600
  4        6.400000000       8.200000000   4000000.000   20.00000        1600
  5        8.200000000      10.000000001   4000000.000   20.00000        1600
  6                  *                 *         0.000    0.00000           0
Sum of non Null cells:
Volume = 20000000.000
Percentage = 100.000
Cell count = 8000
Sum of all cells:
Volume = 20000000.000
Percentage = 100.000
Cell count = 8000
# groups of equal values
GRASS 6.3.cvs > r3.stats -e map3d
100%
Sort non-null values
num | value | volume | perc | cell count
1 1.000000 2000000.000 10.00000 800
2 2.000000 2000000.000 10.00000 800
3 3.000000 2000000.000 10.00000 800
4 4.000000 2000000.000 10.00000 800
5 5.000000 2000000.000 10.00000 800
6 6.000000 2000000.000 10.00000 800
7 7.000000 2000000.000 10.00000 800
8 8.000000 2000000.000 10.00000 800
9 9.000000 2000000.000 10.00000 800
10 10.000000 2000000.000 10.00000 800
11 * 0.000 0.00000 0
Number of groups with equal values: 10
Sum of non Null cells:
Volume = 20000000.000
Percentage = 100.000
Cell count = 8000
Sum of all cells:
Volume = 20000000.000
Percentage = 100.000
Cell count = 8000
The approach is based on a binary search tree algorithm.
Memory consumption grows with the number of sub-ranges or with the
number of equal-value groups; the map does not need to be held
completely in memory.
AFAICT the sub-range search algorithm has O(n log(n)) complexity.
I guess the equal-value computation can be improved, because the
computation time increases with the number of detected equal-value groups.
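The sub-range lookup can be illustrated in a few lines of Python (an assumption on my part: bisect over sorted bin edges stands in here for the binary search tree of the C implementation; the function name is hypothetical). Each cell value is located in its sub-range by binary search, giving O(n log k) time for n cells and k sub-ranges, with memory proportional to k:

```python
import bisect

def subrange_counts(values, edges):
    """Count cells per sub-range [edges[i], edges[i+1]) by binary
    search over the sorted bin edges; memory is O(len(edges))."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect.bisect_right(edges, v) - 1
        if 0 <= i < len(counts):   # values outside all edges are skipped
            counts[i] += 1
    return counts
```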
Sören
[The last two would need to either use the one-pass algorithm (which
can result in negative variance for near-constant data due to rounding
error), or use two passes (computing the mean in the first pass so
that the actual deviations can be used in the second pass). See also:
the history of r.univar.]
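The one-pass versus two-pass distinction can be made concrete with a small Python sketch (function names are mine, for illustration only). The naive one-pass formula E[x^2] - E[x]^2 subtracts two nearly equal large numbers for near-constant data, so catastrophic cancellation can swamp the true variance or even drive it negative; the two-pass version computes the mean first and then sums actual squared deviations:

```python
def variance_one_pass(xs):
    """Naive one-pass population variance: E[x^2] - E[x]^2.
    Cancellation makes this unreliable for near-constant data."""
    n = len(xs)
    s = sum(xs)
    sq = sum(x * x for x in xs)
    return sq / n - (s / n) ** 2

def variance_two_pass(xs):
    """Two-pass population variance: mean first, then the sum of
    actual squared deviations; numerically well-behaved."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n
```

For values clustered around 1e9 with a spread of 0.1, the one-pass result is dominated by rounding error in the ~1e18-sized intermediate sums, while the two-pass result stays accurate.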
As I've mentioned several times before, computing quantiles (e.g.
median) of large amounts of floating-point data is an open-ended
problem; any given approach has both pros and cons.
--
Glynn Clements <[EMAIL PROTECTED]>
Markus
_______________________________________________
grass-dev mailing list
[email protected]
http://grass.itc.it/mailman/listinfo/grass-dev