[ 
https://issues.apache.org/jira/browse/METRON-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15770199#comment-15770199
 ] 

ASF GitHub Bot commented on METRON-637:
---------------------------------------

Github user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/401
  
    I made the suggested corrections and, as a compromise on performance, 
optimized the list munging a bit:
    * A new list of bins is not created per call in either `BIN` or `STATS_BIN`
    * We use the list as passed, converting the `Number` to `Double` lazily.
    * I check that the bins are monotonically increasing as needed, rather 
than before the function runs.
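
    The three points above can be sketched roughly as follows. This is a 
minimal illustration, not the code in the PR: the class `BinSketch`, the 
method `bin`, and the choice to place a value equal to a bound in the lower 
bin are all assumptions made for the example.

```java
import java.util.List;

public class BinSketch {
    /**
     * Returns the index of the first bound that is >= value, or
     * bounds.size() if the value is greater than every bound.
     * (Hypothetical helper; tie-breaking on equality is an assumption.)
     */
    public static int bin(double value, List<? extends Number> bounds) {
        double lastBound = Double.NEGATIVE_INFINITY;
        // Walk the list as passed; no copy of the bins is made.
        for (int i = 0; i < bounds.size(); i++) {
            // Convert each Number to a double only when it is reached.
            double bound = bounds.get(i).doubleValue();
            // Check ordering as we go instead of validating the whole list up front.
            if (bound < lastBound) {
                throw new IllegalStateException(
                    "Bin bounds must be monotonically increasing");
            }
            if (value <= bound) {
                return i;
            }
            lastBound = bound;
        }
        return bounds.size();
    }
}
```

    With this sketch, `BinSketch.bin(27, Arrays.asList(25, 75, 95))` returns 
1. One consequence of checking lazily is that an out-of-order bound past the 
matching bin is never inspected, so it is never reported.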
    
    All that being said, caching would improve performance further, but I 
think we're in a decent spot at the moment.  I rejiggered the performance 
driver to report a distribution of timings rather than a single number.  The 
current run with this change is at:
    `Min/25th/50th/75th/Max Milliseconds: 2687.0 / 2700.5 / 2716.0 / 2733.5 / 3730.0`
    
    Thoughts?


> Add a STATS_BIN function to Stellar.
> ------------------------------------
>
>                 Key: METRON-637
>                 URL: https://issues.apache.org/jira/browse/METRON-637
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Casey Stella
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> When passing parameters to models, it's often useful to pass the binned 
> representation of a variable based on an empirical statistical distribution, 
> rather than the actual variable.  This function should accept a set of 
> percentile bins and a statistical sketch and a value.  It should return the 
> index where the percentile of the value falls.
> For instance, consider the value 17, which falls at percentile 27.  If we 
> use 25, 75, 95 to define our bins, this function would return 1, because 
> the percentile, 27, is between 25 and 75.
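
The description above can be illustrated with a small sketch. It is not the 
Stellar implementation: a sorted array of raw samples stands in for the 
statistical sketch, and the names `StatsBinSketch`, `percentileOf`, and 
`statsBin` are hypothetical. The percentile is taken here to be the percent 
of samples less than or equal to the value, which is also an assumption.

```java
import java.util.Arrays;

public class StatsBinSketch {
    /** Percent of samples that are <= value; samples must be sorted ascending. */
    static double percentileOf(double value, double[] sortedSamples) {
        int count = 0;
        for (double s : sortedSamples) {
            if (s > value) {
                break; // sorted input, so nothing further can be <= value
            }
            count++;
        }
        return 100.0 * count / sortedSamples.length;
    }

    /** Index of the percentile bin that the value's percentile falls into. */
    static int statsBin(double value, double[] samples, double[] percentileBins) {
        double[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        double percentile = percentileOf(value, sorted);
        for (int i = 0; i < percentileBins.length; i++) {
            if (percentile <= percentileBins[i]) {
                return i;
            }
        }
        return percentileBins.length;
    }
}
```

If the samples are the numbers 1 through 100, the value 27 sits at percentile 
27, so `statsBin` with bins 25, 75, 95 returns 1, matching the example in the 
description.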



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
