Re: [GRASS-dev] [GRASS GIS] #3198: r.stats.quantile: hardcoded max number of categries in base map

2017-04-05 Thread GRASS GIS
#3198: r.stats.quantile: hardcoded max number of categries in base map
--+---
  Reporter:  mlennert |  Owner:  grass-dev@…
  Type:  defect   | Status:  closed
  Priority:  normal   |  Milestone:  7.2.1
 Component:  Raster   |Version:  unspecified
Resolution:  fixed|   Keywords:  r.stats.quantile MAX_CATS
   CPU:  Unspecified  |   Platform:  Unspecified
--+---
Changes (by mlennert):

 * status:  new => closed
 * resolution:   => fixed


Comment:

 Closing this as the original issue seems to be fixed. As the solution is
 not a simple bug fix, it should probably not go into 7.2.

--
Ticket URL: 
GRASS GIS 

___
grass-dev mailing list
grass-dev@lists.osgeo.org
https://lists.osgeo.org/mailman/listinfo/grass-dev

Re: [GRASS-dev] [GRASS GIS] #3198: r.stats.quantile: hardcoded max number of categries in base map

2016-11-08 Thread GRASS GIS
#3198: r.stats.quantile: hardcoded max number of categries in base map
--+---
  Reporter:  mlennert |  Owner:  grass-dev@…
  Type:  defect   | Status:  new
  Priority:  normal   |  Milestone:  7.2.1
 Component:  Raster   |Version:  unspecified
Resolution:   |   Keywords:  r.stats.quantile MAX_CATS
   CPU:  Unspecified  |   Platform:  Unspecified
--+---

Comment (by mmetz):

 Replying to [comment:2 mlennert]:
 > Replying to [comment:1 glynn]:
 > > [...]
 > >
 > > There's no fundamental reason why the limit can't be raised; or even
 > > abolished, if you don't mind an unsuitable choice of base map resulting in
 > > "unable to allocate" errors, or just taking forever.
 >
 > A warning was maintained. At least the user is made aware and can stop
 > the module.

 FWIW, I tested with more than a million categories in the base map and the
 module finished within 19 seconds (on an old laptop).

 >
 > > Consider putting a limit on num_cats*num_slots; a map with many
 > > categories should presumably require fewer bins (assuming that the data
 > > isn't concentrated into a handful of categories).
 >
 > In r69776 MarkusM introduced dynamic bins, although I don't really
 > understand what this means ;-).

 For example, if there are only 10 cells for a given basemap category, it
 does not make sense to allocate 1000 bins for that category; a single bin
 is sufficient. With many basemap categories and only a few values for each
 category, memory consumption can drop by up to 90%, to about 10% of what
 the previous version of r.stats.quantile needed. Still, with many basemap
 categories and many cells per category, the module will be slow and will
 need a lot of memory.
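
 To make the idea concrete, here is a minimal sketch of sizing the number
 of bins from the number of cells in a category. It is not the code from
 r69776: the function name choose_num_bins, the target_per_bin parameter
 and the rule itself are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical heuristic, NOT the rule used in r69776:
 * aim for roughly target_per_bin cells in each bin, use at least
 * one bin, and never use more than max_bins (the bins= value). */
static size_t choose_num_bins(size_t n_cells, size_t max_bins,
                              size_t target_per_bin)
{
    size_t n;

    /* Degenerate settings: fall back to a single bin. */
    if (max_bins == 0 || target_per_bin == 0)
        return 1;

    n = n_cells / target_per_bin;
    if (n < 1)
        n = 1;          /* few cells: a single bin is sufficient */
    if (n > max_bins)
        n = max_bins;   /* many cells: cap at the requested maximum */

    return n;
}
```

 With target_per_bin set to 100, a category with 10 cells gets a single
 bin, while a category with a million cells still gets the full 1000.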


Re: [GRASS-dev] [GRASS GIS] #3198: r.stats.quantile: hardcoded max number of categries in base map

2016-11-07 Thread GRASS GIS
#3198: r.stats.quantile: hardcoded max number of categries in base map
--+---
  Reporter:  mlennert |  Owner:  grass-dev@…
  Type:  defect   | Status:  new
  Priority:  normal   |  Milestone:  7.2.1
 Component:  Raster   |Version:  unspecified
Resolution:   |   Keywords:  r.stats.quantile MAX_CATS
   CPU:  Unspecified  |   Platform:  Unspecified
--+---

Comment (by mlennert):

 Replying to [comment:1 glynn]:
 > Replying to [ticket:3198 mlennert]:
 >
 > > Is there any specific reason for this? I would like to use
 > > r.stats.quantile in i.segment.stats to calculate percentiles per segment,
 > > but the number of segments can be much higher than 1000.
 >
 > The limit was added so that if someone tries to use a base map with a
 > million categories, it just fails quickly, rather than attempting
 > something which will either exhaust memory or take days to run.
 >
 > For each category in the base map, it allocates a basecat structure,
 > each of which references several dynamically-allocated arrays. The .slots
 > and .slot_bins arrays are sized based upon the bins= option, the .values
 > array is sized to hold all of the values falling into any bin containing
 > a quantile, the .quants and .bins arrays according to the number of
 > quantiles.
 >
 > As well as the memory consumption, almost all processing is
 > per-category.
 >
 > Having said that, more categories will tend to result in less data per
 > category. However, there are some non-trivial per-category overheads. On
 > the other hand, sorting the bins containing quantiles should be faster
 > overall with more bins but proportionally less data in each bin.
 >
 > There's no fundamental reason why the limit can't be raised; or even
 > abolished, if you don't mind an unsuitable choice of base map resulting in
 > "unable to allocate" errors, or just taking forever.

 A warning was maintained. At least the user is made aware and can stop the
 module.

 > Consider putting a limit on num_cats*num_slots; a map with many
 > categories should presumably require fewer bins (assuming that the data
 > isn't concentrated into a handful of categories).

 In r69776 MarkusM introduced dynamic bins, although I don't really
 understand what this means ;-).

 More generally: the man page of r.stats.quantile lacks information about
 its parameters, notably the 'bins' parameter. A short paragraph explaining
 how the module works would be useful.


Re: [GRASS-dev] [GRASS GIS] #3198: r.stats.quantile: hardcoded max number of categries in base map

2016-11-07 Thread GRASS GIS
#3198: r.stats.quantile: hardcoded max number of categries in base map
--+---
  Reporter:  mlennert |  Owner:  grass-dev@…
  Type:  defect   | Status:  new
  Priority:  normal   |  Milestone:  7.2.1
 Component:  Raster   |Version:  unspecified
Resolution:   |   Keywords:  r.stats.quantile MAX_CATS
   CPU:  Unspecified  |   Platform:  Unspecified
--+---

Comment (by glynn):

 Replying to [ticket:3198 mlennert]:

 > Is there any specific reason for this? I would like to use
 > r.stats.quantile in i.segment.stats to calculate percentiles per segment,
 > but the number of segments can be much higher than 1000.

 The limit was added so that if someone tries to use a base map with a
 million categories, it just fails quickly, rather than attempting
 something which will either exhaust memory or take days to run.

 For each category in the base map, it allocates a basecat structure, each
 of which references several dynamically-allocated arrays. The .slots and
 .slot_bins arrays are sized based upon the bins= option, the .values array
 is sized to hold all of the values falling into any bin containing a
 quantile, the .quants and .bins arrays according to the number of
 quantiles.
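
 As a rough illustration of that layout, the sketch below mirrors the
 member names mentioned above (slots, slot_bins, values, quants, bins) and
 estimates the memory one category needs. The element types, the
 bin_sketch structure and the basecat_bytes helper are assumptions made
 for this sketch, not the actual definitions in main.c.

```c
#include <stddef.h>

/* Hypothetical per-quantile bin record; the fields are assumed. */
struct bin_sketch {
    double min, max;
    size_t base, count;
};

/* Sketch of the per-category structure; element types are assumed. */
struct basecat_sketch {
    size_t *slots;             /* sized from the bins= option */
    unsigned short *slot_bins; /* sized from the bins= option */
    double *values;            /* values in bins that contain a quantile */
    double *quants;            /* one entry per quantile */
    struct bin_sketch *bins;   /* one entry per quantile */
};

/* Estimated bytes needed for ONE base map category. */
static size_t basecat_bytes(size_t num_slots, size_t num_quants,
                            size_t num_values)
{
    return sizeof(struct basecat_sketch)
        + num_slots * (sizeof(size_t) + sizeof(unsigned short))
        + num_values * sizeof(double)
        + num_quants * (sizeof(double) + sizeof(struct bin_sketch));
}
```

 Multiplying this estimate by the number of categories shows why the
 bins= term dominates when a base map has many categories with few cells
 each.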

 As well as the memory consumption, almost all processing is per-category.

 Having said that, more categories will tend to result in less data per
 category. However, there are some non-trivial per-category overheads. On
 the other hand, sorting the bins containing quantiles should be faster
 overall with more bins but proportionally less data in each bin.

 There's no fundamental reason why the limit can't be raised; or even
 abolished, if you don't mind an unsuitable choice of base map resulting in
 "unable to allocate" errors, or just taking forever. Consider putting a
 limit on num_cats*num_slots; a map with many categories should presumably
 require fewer bins (assuming that the data isn't concentrated into a
 handful of categories).
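
 One way such a limit on num_cats*num_slots could look is sketched below.
 The function name cap_slots and the max_total budget are hypothetical;
 nothing here is taken from the module's source.

```c
#include <stddef.h>

/* Hypothetical helper: reduce the number of slots per category so that
 * num_cats * slots stays within max_total. At least one slot is always
 * returned, so the budget can still be exceeded when num_cats alone is
 * larger than max_total. */
static size_t cap_slots(size_t num_cats, size_t requested, size_t max_total)
{
    size_t allowed;

    if (num_cats == 0)
        return requested;

    /* Dividing avoids overflow in num_cats * requested. */
    allowed = max_total / num_cats;
    if (allowed < 1)
        allowed = 1;

    return requested < allowed ? requested : allowed;
}
```

 With a budget of ten million slots, 1000 categories keep all 1000
 requested slots, while a million categories are reduced to 10 slots each.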


[GRASS-dev] [GRASS GIS] #3198: r.stats.quantile: hardcoded max number of categries in base map

2016-11-03 Thread GRASS GIS
#3198: r.stats.quantile: hardcoded max number of categries in base map
---+-
 Reporter:  mlennert   |  Owner:  grass-dev@…
 Type:  defect | Status:  new
 Priority:  normal |  Milestone:  7.2.1
Component:  Raster |Version:  unspecified
 Keywords:  r.stats.quantile MAX_CATS  |CPU:  Unspecified
 Platform:  Unspecified|
---+-
 r.stats.quantile
 [https://trac.osgeo.org/grass/browser/grass/trunk/raster/r.stats.quantile/main.c#L21 limits]
 the number of categories the base map can have to 1000 through a
 MAX_CATS variable.

 Is there any specific reason for this? I would like to use
 r.stats.quantile in i.segment.stats to calculate percentiles per segment,
 but the number of segments can be much higher than 1000.

 Classifying this as a "bug" for now...
