[ https://issues.apache.org/jira/browse/METRON-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15769392#comment-15769392 ]
ASF GitHub Bot commented on METRON-637:
---------------------------------------
Github user mattf-horton commented on a diff in the pull request:
https://github.com/apache/incubator-metron/pull/401#discussion_r93575237
--- Diff:
metron-analytics/metron-statistics/src/test/java/org/apache/metron/statistics/StellarStatisticsFunctionsTest.java
---
@@ -373,15 +373,16 @@ public void statsBinRunner(List<Double> splits) throws Exception {
public void statsBinRunner(List<Double> splits, String splitsName) throws Exception {
int bin = 0;
+ StatisticsProvider provider = (StatisticsProvider)variables.get("stats");
for(Double d : stats.getSortedValues()) {
- StatisticsProvider provider = (StatisticsProvider)variables.get("stats");
if(bin < splits.size()) {
double percentileOfBin = provider.getPercentile(splits.get(bin));
if (d > percentileOfBin) {
//we aren't the right bin, so let's find the right one.
// Keep in mind that this value could be more than one bin away from the last good bin.
- for(;bin < splits.size() && d > provider.getPercentile(splits.get(bin));bin++) {
-
+ while ( bin < splits.size() && d > provider.getPercentile(splits.get(bin)) ) {
+ //increment the bin number until it includes the target value, or we run out of bins
+ bin++;
--- End diff --
This whole block:
```
if(bin < splits.size()) {
  double percentileOfBin = provider.getPercentile(splits.get(bin));
  if (d > percentileOfBin) {
    //we aren't the right bin, so let's find the right one.
    // Keep in mind that this value could be more than one bin away from the last good bin.
    while ( bin < splits.size() && d > provider.getPercentile(splits.get(bin)) ) {
      //increment the bin number until it includes the target value, or we run out of bins
      bin++;
    }
  }
}
```
can be replaced by:
```
while ( bin < splits.size() && d > provider.getPercentile(splits.get(bin)) ) {
  //increment the bin number until it includes the target value, or we run out of bins
  bin++;
}
```
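The simplification works because the outer `if` checks are already the loop condition: if `bin` is out of range or `d` does not exceed the current boundary, the `while` loop body never runs. A minimal sketch of that equivalence follows. It assumes the percentile boundaries are precomputed into a plain list (a stand-in for `provider.getPercentile(splits.get(bin))`), and the class and method names are hypothetical, not part of the Metron test.

```java
import java.util.Arrays;
import java.util.List;

public class BinLoopEquivalence {
    // Nested form, as in the original block: guard, compare, then loop.
    // "boundaries" is a hypothetical stand-in for the percentile lookups.
    static int advanceNested(int bin, double d, List<Double> boundaries) {
        if (bin < boundaries.size()) {
            double percentileOfBin = boundaries.get(bin);
            if (d > percentileOfBin) {
                while (bin < boundaries.size() && d > boundaries.get(bin)) {
                    bin++;
                }
            }
        }
        return bin;
    }

    // Flat form, as in the suggested replacement: the loop condition
    // already covers both outer checks.
    static int advanceFlat(int bin, double d, List<Double> boundaries) {
        while (bin < boundaries.size() && d > boundaries.get(bin)) {
            bin++;
        }
        return bin;
    }

    public static void main(String[] args) {
        List<Double> boundaries = Arrays.asList(10.0, 20.0, 30.0);
        double[] values = {5.0, 10.0, 15.0, 25.0, 30.0, 99.0};
        // Compare both forms for every starting bin, including "past the end".
        for (int start = 0; start <= boundaries.size(); start++) {
            for (double d : values) {
                int nested = advanceNested(start, d, boundaries);
                int flat = advanceFlat(start, d, boundaries);
                System.out.println("start=" + start + " d=" + d
                        + " nested=" + nested + " flat=" + flat
                        + " same=" + (nested == flat));
            }
        }
    }
}
```

One behavioral note: the flat form no longer assigns `percentileOfBin`, so it only applies if nothing later in the loop body reads that variable.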
> Add a STATS_BIN function to Stellar.
> ------------------------------------
>
> Key: METRON-637
> URL: https://issues.apache.org/jira/browse/METRON-637
> Project: Metron
> Issue Type: Improvement
> Reporter: Casey Stella
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> When passing parameters to models, it's often useful to pass the binned
> representation of a variable based on an empirical statistical distribution,
> rather than the actual variable. This function should accept a set of
> percentile bins, a statistical sketch, and a value. It should return the
> index of the bin in which the value's percentile falls.
> For instance, consider the value 17, which is at percentile 27. If we use 25, 75,
> and 95 to define our bins, this function would return 1, because its percentile,
> 27, is between 25 and 75.
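The binning rule in the description can be sketched as follows. This is an illustration only, not Metron's STATS_BIN implementation: it takes the value's percentile directly instead of querying a statistical sketch, the class and method names are hypothetical, and treating a percentile exactly on a boundary as belonging to the lower bin is an assumption (it mirrors the `d >` comparison in the test code under review).

```java
import java.util.Arrays;
import java.util.List;

public class StatsBinSketch {
    // Hypothetical helper: returns the index of the bin that contains
    // percentileOfValue, given ascending bin boundaries such as 25, 75, 95.
    // Index 0 is "at or below the first boundary"; index boundaries.size()
    // is "above the last boundary".
    static int binIndex(double percentileOfValue, List<Double> boundaries) {
        int bin = 0;
        while (bin < boundaries.size() && percentileOfValue > boundaries.get(bin)) {
            bin++;
        }
        return bin;
    }

    public static void main(String[] args) {
        List<Double> bins = Arrays.asList(25.0, 75.0, 95.0);
        // The example from the issue: percentile 27 lies between 25 and 75.
        System.out.println(binIndex(27.0, bins)); // prints 1
    }
}
```

With three boundaries there are four possible results (0 through 3), so a consumer of the function should expect one more bin than the number of percentiles passed in.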