Re: [Geotools-devel] Raster classifier operation for large inputs

2018-11-14 Thread Justin Deoliveira
Ahhh sorry, totally missed that, my apologies, too much multitasking on my
part :) Anyways, +1 from me fwiw.

___
GeoTools-Devel mailing list
GeoTools-Devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geotools-devel


Re: [Geotools-devel] Raster classifier operation for large inputs

2018-11-14 Thread Andrea Aime
On Wed, Nov 14, 2018 at 7:54 PM Justin Deoliveira 
wrote:

> Hey Andrea,
> All of your changes sound good to me. Only question I have is whether your
> proposed change will replace what is there? Or is your thought to add some
> config parameter that would trigger the histogram / approximation based
> method?
>

A new entry in the methods enum, to be used by the caller when an
approximate calculation is desirable.
Citing from my initial mail (yes, it was a bit of a wall of text):

"Ideally, these would be new entries in the ClassificationMethod
enumeration, say QUANTILE_HISTOGRAM and NATURAL_BREAKS_HISTOGRAM, and
ClassBreaksOpImage would have an extra optional parameter to decide the
bucket count (with some reasonable defaults, e.g. 256 for byte data,
1000 for any other type)."


> As for moving the code to jai-ext, that definitely makes sense to me.
>

Great, thanks for following up!

Cheers
Andrea

==

GeoServer Professional Services from the experts! Visit http://goo.gl/it488V
for more information. == Ing. Andrea Aime @geowolf Technical Lead
GeoSolutions S.A.S. Via di Montramito 3/A 55054 Massarosa (LU) phone: +39
0584 962313 fax: +39 0584 1660272 mob: +39 339 8844549
http://www.geo-solutions.it http://twitter.com/geosolutions_it
--- *Con riferimento
alla normativa sul trattamento dei dati personali (Reg. UE 2016/679 -
Regolamento generale sulla protezione dei dati “GDPR”), si precisa che ogni
circostanza inerente alla presente email (il suo contenuto, gli eventuali
allegati, etc.) è un dato la cui conoscenza è riservata al/i solo/i
destinatario/i indicati dallo scrivente. Se il messaggio Le è giunto per
errore, è tenuta/o a cancellarlo, ogni altra operazione è illecita. Le
sarei comunque grato se potesse darmene notizia. This email is intended
only for the person or entity to which it is addressed and may contain
information that is privileged, confidential or otherwise protected from
disclosure. We remind that - as provided by European Regulation 2016/679
“GDPR” - copying, dissemination or use of this e-mail or the information
herein by anyone other than the intended recipient is prohibited. If you
have received this email by mistake, please notify us immediately by
telephone or e-mail.*


Re: [Geotools-devel] Raster classifier operation for large inputs

2018-11-14 Thread Simone Giannecchini
For what it's worth, I like the plan: improving the current code and
making it shareable between multiple parts of the codebase by pushing it
back to JAI-Ext.

It will be nice to have in the SLDService.

Regards,
Simone Giannecchini
==
GeoServer Professional Services from the experts!
Visit http://goo.gl/it488V for more information.
==
Ing. Simone Giannecchini
@simogeo
Founder/Director

GeoSolutions S.A.S.
Via di Montramito 3/A
55054  Massarosa (LU)
Italy
phone: +39 0584 962313
fax: +39 0584 1660272
mob:   +39  333 8128928

http://www.geo-solutions.it
http://twitter.com/geosolutions_it


Re: [Geotools-devel] Raster classifier operation for large inputs

2018-11-14 Thread Justin Deoliveira
Hey Andrea,

All of your changes sound good to me. Only question I have is whether your
proposed change will replace what is there? Or is your thought to add some
config parameter that would trigger the histogram / approximation based
method?

As for moving the code to jai-ext, that definitely makes sense to me.

-Justin



[Geotools-devel] Raster classifier operation for large inputs

2018-11-14 Thread Andrea Aime
Hi,
I'm looking into extending the GeoServer SLDService API to work against
raster data too.
The current code in that module works off the vector classification
functions for equal intervals, natural breaks and quantiles.

When looking at extending it for rasters, I stumbled into
ClassBreaksOpImage and its subclasses, which do more or less what I
need... with a hitch though: the rasters that I'm playing with can be
large and hold floats/doubles.

Looking at the implementation for quantiles and natural breaks I've
noticed that all input values get collected either in a List<Double> or
a Map<Double, Integer>, where the doubles are the values and the integer
is a pixel count.
Mind, this is the same as what the vector code is doing, but getting to a
million of those in raster space only requires a 1000x1000 image... and
millions of double values (or map entries) take up a lot of space. I could
look into using non-boxed variants, but that is not really the issue:
keeping track of all values simply requires too much space.
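
To make the concern concrete, here is a minimal sketch of exact value
counting in the spirit described above. It is an illustration only, not
the actual ClassBreaksOpImage code: the class name ExactValueCounts and
its count method are hypothetical. The point is that the map holds one
boxed entry per distinct value, so a float/double raster with mostly
distinct pixels produces roughly one entry per pixel.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class, not part of GeoTools or JAI-Ext.
final class ExactValueCounts {

    /** Counts how many pixels carry each distinct value, using boxed keys and counts. */
    static Map<Double, Integer> count(double[] samples) {
        Map<Double, Integer> counts = new TreeMap<>();
        for (double v : samples) {
            // One map entry per distinct value: memory grows with the data.
            counts.merge(v, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A 1000x1000 raster where every pixel holds a different value.
        double[] samples = new double[1000 * 1000];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = i * 0.001;
        }
        System.out.println("map entries: " + count(samples).size());
    }
}
```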

So I'd like to add an approximate calculator instead, one that collects
histograms and then works off the result applying the same logic as today.
Ideally, these would be new entries in the ClassificationMethod
enumeration, say QUANTILE_HISTOGRAM and NATURAL_BREAKS_HISTOGRAM, and
ClassBreaksOpImage would have an extra optional parameter to decide the
bucket count (with some reasonable defaults, e.g. 256 for byte data,
1000 for any other type).
Working off histograms has a clear benefit: the size of the working memory
is fixed at the start, and it's possible to use primitives in the data
structure. Of course it also means the resulting classification won't be
exact, but it should be close enough.
The downside is that the min/max values need to be known in advance to
build the buckets, so for the histogram based methods the "extrema"
parameter in the ClassBreaksOpImage will be mandatory (exception thrown
if not provided).
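
A rough sketch of the histogram idea for the quantile case follows. It is
an illustration under assumptions, not the proposed implementation: the
class HistogramQuantiles and the methods defaultBuckets and breaks are
hypothetical names, and skipping samples that fall outside the extrema is
an assumed policy. It shows the three points made above: a primitive
long[] whose size is fixed up front, the suggested default bucket counts,
and an exception when the extrema are missing.

```java
import java.util.Arrays;

// Hypothetical helper, not part of GeoTools or JAI-Ext.
final class HistogramQuantiles {

    /** Defaults suggested in the message: 256 buckets for byte data, 1000 otherwise. */
    static int defaultBuckets(boolean byteData) {
        return byteData ? 256 : 1000;
    }

    /**
     * Approximate quantile breaks computed from a fixed-size histogram.
     * extrema is {min, max} and is mandatory; returns classes + 1 break values.
     */
    static double[] breaks(double[] samples, double[] extrema, int buckets, int classes) {
        if (extrema == null || extrema.length != 2) {
            throw new IllegalArgumentException("extrema (min, max) are required");
        }
        double min = extrema[0];
        double max = extrema[1];
        if (!(max > min) || buckets < 1 || classes < 1) {
            throw new IllegalArgumentException("need max > min, buckets >= 1, classes >= 1");
        }
        double width = (max - min) / buckets;

        // Working memory is fixed here, regardless of the raster size.
        long[] counts = new long[buckets];
        long total = 0;
        for (double v : samples) {
            if (Double.isNaN(v) || v < min || v > max) {
                continue; // assumed policy: ignore values outside the extrema
            }
            int idx = (int) ((v - min) / width);
            if (idx >= buckets) {
                idx = buckets - 1; // v == max lands in the last bucket
            }
            counts[idx]++;
            total++;
        }
        if (total == 0) {
            throw new IllegalArgumentException("no samples within the extrema");
        }

        double[] result = new double[classes + 1];
        result[0] = min;
        result[classes] = max;
        long cumulative = 0;
        int next = 1; // index of the next interior break to emit
        for (int b = 0; b < buckets && next < classes; b++) {
            cumulative += counts[b];
            // Emit a break at the bucket's upper edge once next/classes of the pixels are covered.
            while (next < classes && cumulative * classes >= (long) next * total) {
                result[next++] = min + (b + 1) * width;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] samples = new double[1000];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = i;
        }
        double[] b = breaks(samples, new double[] {0, 1000}, defaultBuckets(false), 4);
        System.out.println(Arrays.toString(b));
    }
}
```

Each break is only accurate to one bucket width, which is the "close
enough" trade-off mentioned above; raising the bucket count tightens the
approximation at the cost of a larger, but still fixed, array.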

How does that sound?

Cheers
Andrea

PS: most operations are in jai-ext. Does anyone mind if ClassBreaksOpImage
gets moved there, in its own module?
