[
https://issues.apache.org/jira/browse/METRON-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15815076#comment-15815076
]
ASF GitHub Bot commented on METRON-637:
---------------------------------------
Github user cestella commented on a diff in the pull request:
https://github.com/apache/incubator-metron/pull/401#discussion_r95371355
--- Diff:
metron-analytics/metron-statistics/src/test/java/org/apache/metron/statistics/StellarStatisticsFunctionsTest.java
---
@@ -356,6 +357,49 @@ public void testSkewness() throws Exception {
assertEquals(stats.getSkewness(), (Double) actual, 0.1);
}
+ @Test
+ public void testStatsBin() throws Exception {
--- End diff --
This test checks that the `STATS_BIN` function operates correctly by taking a
sorted list of numbers, walking down it, and ensuring that `STATS_BIN` yields
the correct bin for each number. This is a reasonable test because we are not
recomputing the bin; we rely on the fact that, since the numbers are sorted,
the bin can only increase at the percentile boundaries. That gives us the
expected bin without recreating the computation done in the `STATS_BIN`
function.
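
To make the argument concrete, here is a minimal, self-contained sketch of
that testing strategy. It is not the code from the pull request: `binOf` is a
hypothetical stand-in for `STATS_BIN` that uses exact counts over a small
sample instead of a statistical sketch, and the class and method names are
invented for illustration. The expected bin is derived only from each
element's position in the sorted list.

```java
class SortedBinWalkSketch {
    // Hypothetical stand-in for the function under test (NOT Metron's
    // STATS_BIN): bins a value by the percentage of sample values strictly
    // below it. Treating a percentile equal to a boundary as belonging to the
    // higher bin is an assumption of this sketch.
    static int binOf(double value, double[] sortedSample, double[] boundsPct) {
        int below = 0;
        for (double v : sortedSample) {
            if (v < value) {
                below++;
            }
        }
        double pct = 100.0 * below / sortedSample.length;
        int bin = 0;
        for (double bound : boundsPct) {
            if (pct < bound) {
                break;
            }
            bin++;
        }
        return bin;
    }

    // Walks the sorted sample. The expected bin starts at 0 and is bumped
    // whenever the position in the list crosses a percentile boundary, so the
    // expectation never calls the function under test.
    static boolean walkMatches(double[] sortedSample, double[] boundsPct) {
        int expectedBin = 0;
        for (int i = 0; i < sortedSample.length; i++) {
            double positionPct = 100.0 * i / sortedSample.length;
            while (expectedBin < boundsPct.length
                    && positionPct >= boundsPct[expectedBin]) {
                expectedBin++;
            }
            if (binOf(sortedSample[i], sortedSample, boundsPct) != expectedBin) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // 100 distinct, already sorted values: 0.0, 1.0, ..., 99.0.
        double[] sample = new double[100];
        for (int i = 0; i < sample.length; i++) {
            sample[i] = i;
        }
        double[] bounds = {25.0, 75.0, 95.0};
        System.out.println(walkMatches(sample, bounds));
    }
}
```

With distinct values and exact counts the two sides agree exactly. A test
against a real statistical sketch may need to tolerate approximation error
near the boundaries.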
> Add a STATS_BIN function to Stellar.
> ------------------------------------
>
> Key: METRON-637
> URL: https://issues.apache.org/jira/browse/METRON-637
> Project: Metron
> Issue Type: Improvement
> Reporter: Casey Stella
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> When passing parameters to models, it's often useful to pass the binned
> representation of a variable based on an empirical statistical distribution,
> rather than the actual variable. This function should accept a set of
> percentile bins, a statistical sketch, and a value. It should return the
> index of the bin in which the value's percentile falls.
> For instance, consider the value 17, whose percentile is 27. If we use 25,
> 75, and 95 to define our bins, this function would return 1, because its
> percentile, 27, is between 25 and 75.
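
The index lookup described in the issue can be sketched as follows, assuming
the value's percentile has already been obtained from the statistical sketch.
The class and method names are hypothetical, and this is not the Stellar
implementation. The issue does not say which bin a percentile exactly equal to
a boundary belongs to; the sketch assumes the higher one.

```java
import java.util.Arrays;
import java.util.List;

class PercentileBinSketch {
    // Returns the number of boundaries that are less than or equal to the
    // given percentile, which is the index of the bin it falls in.
    // Boundaries are assumed to be sorted in ascending order.
    static int bin(double percentile, List<Double> bounds) {
        int index = 0;
        for (double bound : bounds) {
            if (percentile < bound) {
                break;
            }
            index++;
        }
        return index;
    }

    public static void main(String[] args) {
        List<Double> bounds = Arrays.asList(25.0, 75.0, 95.0);
        // The example from the issue: percentile 27 lies between 25 and 75.
        System.out.println(bin(27.0, bounds)); // prints 1
    }
}
```

With three boundaries there are four possible results: 0 below the first
boundary, 3 at or above the last.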
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)