Jody Garnett wrote:
> What a difficult question; is there a strict definition of the quantile 
> function we could grab from statistics or something?

I did not find much, and nothing I found discusses how to
handle flat areas in the data histogram:
http://www.gisbanker.com/introduction_part5.htm
http://www.geovista.psu.edu/grants/dg-qg/classing_epi/summary.html
http://www.censusmapper.com/CM_Help/classifyfield.htm
...

> Given your example I want to ask: what is more important; the number of 
> classifications, or the fact that they are "even" in size...

If only the number were important, an equal interval classification
would have been chosen. Quantile is defined by the "even in size",
but given enough flat areas in your data histogram, how do you guess
what the even size would be?
The method I suggested guarantees neither the number of intervals nor
the equal size; it just avoids the silly interval structure. Do you have
any suggestion on how to deal with this? What would you do with:

Quantile(  {-1 -2 0 0 0 0 3 5 7 9}, 2) ==> ?
Quantile(  {-1 -2 0 0 0 0 3 5 7 9}, 3) ==> ?

The method I proposed, that is, detect the flat area in the histogram
and avoid breaking the class until you get out of it, would generate
the same result for both:

{-1 -2 0 0 0 0} {3 5 7 9}
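
A minimal sketch of one possible reading of the "don't break inside a
flat area" rule, in Python. The function name, the ceil(n/k) target class
size, and the initial sort are assumptions made for illustration; this is
not the GeoTools code.

```python
import math

def quantile_no_flat_breaks(values, k):
    """Split sorted values into at most k classes, never breaking a run of equal values."""
    data = sorted(values)
    n = len(data)
    if n == 0 or k < 1:
        return []
    size = math.ceil(n / k)  # assumed target class size
    classes = []
    start = 0
    while start < n:
        end = min(start + size, n)
        # If the nominal break falls inside a flat area, push it to the end of that run.
        while end < n and data[end] == data[end - 1]:
            end += 1
        classes.append(data[start:end])
        start = end
    return classes

sample = [-1, -2, 0, 0, 0, 0, 3, 5, 7, 9]
print(quantile_no_flat_breaks(sample, 2))
print(quantile_no_flat_breaks(sample, 3))
```

Under these assumptions both calls return the same two classes, because
the first break is pushed past the run of zeros and the remaining values
fit in one class.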

For 3 intervals, another not totally silly output could be:

{-1 -2} {0 0 0 0} {3 5 7 9}

Generally speaking: detect flat areas and, if they are big enough, make
them a separate class, since they somehow represent an anomaly in the data.
Of course, applying this principle you could get more classes than you
asked for. For example:

Quantile(  {-10 -9 -2 0 0 0 1 2 4 9 9 9}, 3) ==> what now?

The "don't break if in a flat area" approach would generate only 2 classes:
{-10 -9 -2 0 0 0} {1 2 4 9 9 9}

The "break out flat areas if big enough" approach would generate 4:
{-10 -9 -2} {0 0 0} {1 2 4} {9 9 9}
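
A rough sketch of the "break out flat areas if big enough" idea, in
Python. What counts as "big enough" is not defined above, so the
min_run threshold (3 here) is a hypothetical parameter, as is the
function name. The sketch only splits at flat runs and does not
subdivide the remaining segments any further.

```python
from itertools import groupby

def split_out_flat_runs(values, min_run=3):
    """Make every run of >= min_run equal values its own class."""
    classes = []
    pending = []  # values collected since the last flat run
    for _, group in groupby(sorted(values)):
        run = list(group)
        if len(run) >= min_run:
            # Close the segment before the flat run, then emit the run itself.
            if pending:
                classes.append(pending)
                pending = []
            classes.append(run)
        else:
            pending.extend(run)
    if pending:
        classes.append(pending)
    return classes

print(split_out_flat_runs([-10, -9, -2, 0, 0, 0, 1, 2, 4, 9, 9, 9]))
```

With min_run=3 this gives the four classes listed above, regardless of
how many classes were requested.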

> If we go for even in size; you may get 2 categories when you asked for 
> three
> Quantile(  {0 0 0 0 3 5 7 9}, 2) ==> {0 0 0 0}, { 3 5 7 9 }
> Quantile(  {0 0 0 0 3 5 7 9}, 3) ==> {0 0 0 0}, { 3 5 7 9 }
> 
> This may be a strange case of what do you expect? If I am looking at a 
> map or a summary of it, I want to know what the colors represent; and if I ask 
> the application to color equal quantities of data in different colors; 
> for the data you provided we could only make a map with 2 categories; 
> anything else would be a mistake ...
> 
> So while I can think of silly ways to break the content up into {0 0} 
> and {0 0} - they are just that - silly.

Yeah, silly. Unfortunately that's exactly what you're getting today out
of the simple quantile classification. I have cases, with real data,
where the current function generates 3 consecutive intervals at 0.
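
For illustration only, a naive equal-count split in Python shows how
repeated intervals at the same value can arise. This is an assumed
stand-in, not the actual GeoTools function; the name naive_quantile and
the floor-based break positions are hypothetical.

```python
def naive_quantile(values, k):
    """Cut the sorted values into k slices of (nearly) equal count, ignoring ties."""
    data = sorted(values)
    n = len(data)
    # Break positions at i*n/k, rounded down.
    bounds = [i * n // k for i in range(k + 1)]
    return [data[lo:hi] for lo, hi in zip(bounds, bounds[1:])]

print(naive_quantile([0, 0, 0, 0, 3, 5, 7, 9], 4))
```

Here the first two classes both contain only zeros, so both intervals
collapse to 0.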

Cheers
Andrea
