Hi,
Markus Neteler wrote:
On Sat, Jun 02, 2007 at 01:00:12PM +0100, Glynn Clements wrote:
Markus Neteler wrote:
would it be much work to fix this:
GRASS 6.3.cvs (nc_spm_05):~ > r.statistics base=landuse96_28m \
cover=elevation out=elevstats_avg method=average
ERROR: This module currently only works for integer (CELL) maps
Rounding elevation to CELL first isn't a great option.
1. r.statistics works by reclassing the base map, so the base map
can't be FP.
In this case I meant the cover map "elevation", which is rejected:
landuse96_28m is a CELL map, elevation is FCELL.
2. r.statistics uses r.stats to calculate the statistics, and r.stats
reads its inputs as CELL.
Could r.statistics also accept an FCELL cover map in this case,
without complaining? Currently I need an extra step to round
elevation to a CELL map before running r.statistics.
r.stats is inherently based upon discrete categories. Even if it reads
FP maps as FP, you would need to quantise the values, otherwise you
would end up with every cell as its own category with count == 1. This
would require memory proportional to the size of the input map
multiplied by a significant factor (48-64 bytes per cell, or even
more).
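The quantisation step described above can be sketched as follows. This is a minimal Python illustration (not GRASS C code); the function name and interface are hypothetical. It bins FP values into a fixed number of equal-width sub-ranges so the category count, and hence memory, stays bounded instead of growing with the number of distinct FP values:

```python
from collections import Counter

def quantize_counts(values, vmin, vmax, nsteps):
    """Bin floating-point values into nsteps equal-width sub-ranges
    and count cells per bin; memory is O(nsteps) rather than
    proportional to the number of distinct FP values."""
    width = (vmax - vmin) / nsteps
    counts = Counter()
    for v in values:
        # clamp the top edge so v == vmax falls into the last bin
        i = min(int((v - vmin) / width), nsteps - 1)
        counts[i] += 1
    return {(vmin + i * width, vmin + (i + 1) * width): n
            for i, n in sorted(counts.items())}
```

Without such binning, a per-category hash or tree over raw FP values degenerates into one entry per cell, which is exactly the memory blow-up described above.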
To handle FP data, you really need a completely new approach which
computes aggregates incrementally, using an accumulator. This would
limit it to aggregates which can be computed that way, e.g. count,
sum, mean, variance and standard deviation.
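As a sketch of that accumulator idea (plain Python, hypothetical class name, not an actual GRASS module): one such accumulator per base category updates count, sum, mean and variance incrementally in O(1) memory, here using Welford's update for the variance:

```python
class RunningStats:
    """Incremental (streaming) aggregates: count, sum, mean,
    variance via Welford's method; O(1) memory per category."""
    def __init__(self):
        self.n = 0
        self.total = 0.0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def add(self, x):
        self.n += 1
        self.total += x
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        # population variance; undefined for an empty accumulator
        return self.m2 / self.n if self.n > 0 else float('nan')
```

Welford's update avoids the cancellation problem of the naive one-pass formula while still needing only a single pass over the data.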
I vaguely remember that Soeren already has something in the works.
I have implemented r3.stats from scratch to handle FP ranges:
GRASS 6.3.cvs > r3.stats help
Description:
Generates volume statistics for raster3d maps.
Keywords:
raster3d, statistics
Usage:
r3.stats [-e] input=name [nsteps=value] [--verbose] [--quiet]
Flags:
-e Calculate statistics based on equal value groups
--v Verbose module output
--q Quiet module output
Parameters:
input Name of input raster3d map
nsteps Number of sub-ranges to collect stats from
default: 20
Example:
#region
north: 1000
south: 0
west: 0
east: 2000
top: 10.00000000
bottom: 0.00000000
nsres: 50
nsres3: 50
ewres: 50
ewres3: 50
tbres: 1
rows: 20
rows3: 20
cols: 40
cols3: 40
depths: 10
cells: 800
3dcells: 8000
# create a 3d map
r3.mapcalc "map3d = depth()"
# automatically calculated value range sub-groups
GRASS 6.3.cvs > r3.stats map3d nsteps=5
100%
num | minimum <= value | value < maximum |      volume |     perc | cell count
  1        1.000000000       2.800000000   4000000.000   20.00000        1600
  2        2.800000000       4.600000000   4000000.000   20.00000        1600
  3        4.600000000       6.400000000   4000000.000   20.00000        1600
  4        6.400000000       8.200000000   4000000.000   20.00000        1600
  5        8.200000000      10.000000001   4000000.000   20.00000        1600
  6                  *                 *         0.000    0.00000           0
Sum of non Null cells:
Volume = 20000000.000
Percentage = 100.000
Cell count = 8000
Sum of all cells:
Volume = 20000000.000
Percentage = 100.000
Cell count = 8000
# groups of equal values
GRASS 6.3.cvs > r3.stats -e map3d
100%
Sort non-null values
num | value | volume | perc | cell count
1 1.000000 2000000.000 10.00000 800
2 2.000000 2000000.000 10.00000 800
3 3.000000 2000000.000 10.00000 800
4 4.000000 2000000.000 10.00000 800
5 5.000000 2000000.000 10.00000 800
6 6.000000 2000000.000 10.00000 800
7 7.000000 2000000.000 10.00000 800
8 8.000000 2000000.000 10.00000 800
9 9.000000 2000000.000 10.00000 800
10 10.000000 2000000.000 10.00000 800
11 * 0.000 0.00000 0
Number of groups with equal values: 10
Sum of non Null cells:
Volume = 20000000.000
Percentage = 100.000
Cell count = 8000
Sum of all cells:
Volume = 20000000.000
Percentage = 100.000
Cell count = 8000
The approach is based on a binary search tree algorithm.
Memory consumption grows with the number of sub-ranges or with the
number of equal-value groups; the map does not need to be held
completely in memory.
AFAICT the sub-range search algorithm has O(n log(n)) complexity.
I guess the equal-value computation can be improved, because the
computation time increases with the number of detected equal-value groups.
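The sub-range lookup can be illustrated in a few lines of Python (an assumption on my part: bisect over sorted bin edges stands in here for the binary search tree of the C implementation; the function name is hypothetical). Each cell value is located in its sub-range by binary search, giving O(n log k) time for n cells and k sub-ranges, with memory proportional to k:

```python
import bisect

def subrange_counts(values, edges):
    """Count cells per sub-range [edges[i], edges[i+1]) by binary
    search over the sorted bin edges; memory is O(len(edges))."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        i = bisect.bisect_right(edges, v) - 1
        if 0 <= i < len(counts):   # values outside all edges are skipped
            counts[i] += 1
    return counts
```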
Sören
[The last two would need to either use the one-pass algorithm (which
can result in negative variance for near-constant data due to rounding
error), or use two passes (computing the mean in the first pass so
that the actual deviations can be used in the second pass). See also:
the history of r.univar.]
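The one-pass versus two-pass distinction can be made concrete with a small Python sketch (function names are mine, for illustration only). The naive one-pass formula E[x^2] - E[x]^2 subtracts two nearly equal large numbers for near-constant data, so catastrophic cancellation can swamp the true variance or even drive it negative; the two-pass version computes the mean first and then sums actual squared deviations:

```python
def variance_one_pass(xs):
    """Naive one-pass population variance: E[x^2] - E[x]^2.
    Cancellation makes this unreliable for near-constant data."""
    n = len(xs)
    s = sum(xs)
    sq = sum(x * x for x in xs)
    return sq / n - (s / n) ** 2

def variance_two_pass(xs):
    """Two-pass population variance: mean first, then the sum of
    actual squared deviations; numerically well-behaved."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n
```

For values clustered around 1e9 with a spread of 0.1, the one-pass result is dominated by rounding error in the ~1e18-sized intermediate sums, while the two-pass result stays accurate.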
As I've mentioned several times before, computing quantiles (e.g.
median) of large amounts of floating-point data is an open-ended
problem; any given approach has both pros and cons.
--
Glynn Clements <[EMAIL PROTECTED]>
Markus
_______________________________________________
grass-dev mailing list
[email protected]
http://grass.itc.it/mailman/listinfo/grass-dev