[ 
https://issues.apache.org/jira/browse/MAHOUT-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034945#comment-13034945
 ] 

Dmitriy Lyubimov commented on MAHOUT-634:
-----------------------------------------

Ted, 

I am also using this, with slight modifications to make it usable with 
map-reduce. I implemented two suggestions on the side: updates to the past 
(unordered input w.r.t. time of sampling, albeit potentially less 
numerically stable) and combining of summaries for use with MR: 
http://weatheringthrutechdays.blogspot.com/2011/04/follow-up-for-mean-summarizer-post.html
No algorithm in Mahout currently uses MR for summarizing inputs, but one might. 
These improvements made it possible to implement Pig functions that run those formulas.
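
To make the two modifications concrete, here is a minimal sketch in Python. It assumes the summarizer state has the (s, w, t) form used for exponentially weighted averages with irregular sampling: a decayed sum, a decayed weight, and the time of the latest sample. The names EwmaState, update, combine and mean are hypothetical and are not taken from the patch or from the Pig functions.

```python
import math
from dataclasses import dataclass


@dataclass
class EwmaState:
    s: float = 0.0  # decayed sum of observations
    w: float = 0.0  # decayed sum of weights
    t: float = 0.0  # time of the most recent observation seen so far


def update(state, x, t, alpha):
    """Add observation x taken at time t; t may be older than state.t."""
    if state.w == 0.0:
        # Empty summary: the first observation gets weight 1.
        return EwmaState(x, 1.0, t)
    if t >= state.t:
        # Usual in-order step: decay the history, add x with weight 1.
        pi = math.exp(-(t - state.t) / alpha)
        return EwmaState(x + pi * state.s, 1.0 + pi * state.w, t)
    # Update to the past: keep the reference time, down-weight x instead.
    pi = math.exp(-(state.t - t) / alpha)
    return EwmaState(state.s + pi * x, state.w + pi, state.t)


def combine(a, b, alpha):
    """Merge two partial summaries, e.g. in a combiner or reducer."""
    if a.w == 0.0:
        return b
    if b.w == 0.0:
        return a
    newer, older = (a, b) if a.t >= b.t else (b, a)
    # Decay the older summary forward to the newer reference time.
    pi = math.exp(-(newer.t - older.t) / alpha)
    return EwmaState(newer.s + pi * older.s, newer.w + pi * older.w, newer.t)


def mean(state):
    """Current weighted average, or NaN if nothing has been observed."""
    return state.s / state.w if state.w > 0.0 else float("nan")
```

Under these assumptions, feeding the same samples in any order, or summarizing partitions separately and merging them with combine, gives the same average up to floating-point rounding.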

I also experimented with yet another biased estimator for binomial sums 
(similar to the use of a beta distribution as a conjugate prior for the 
binomial distribution) that converges on a predefined value P_0 (similar to 
the beta distribution mode converging to 0.5 as n goes to 0) under two 
circumstances: 1) there is a lack of history (as in the beta-distribution-based 
estimate); 2) there is a lack of _recent_ history. 
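
One way to get that behavior, shown only as an illustration and not as the formula I actually used: keep the decayed sum s and weight w of the 0/1 outcomes, and mix in a fixed pseudo-observation equal to P_0 that never decays. The function name shrunk_rate and the prior_weight parameter are hypothetical.

```python
import math


def shrunk_rate(s, w, t_last, t_now, alpha, p0, prior_weight=1.0):
    """Estimate a success rate that falls back to p0 when evidence is thin.

    s, w    -- decayed sum of 0/1 outcomes and decayed weight, as of t_last
    t_now   -- query time, assumed to be >= t_last
    alpha   -- time constant of the exponential decay
    p0      -- value to converge on without (recent) history
    """
    # Age the history to the query time; an empty history contributes nothing.
    decay = math.exp(-(t_now - t_last) / alpha) if w > 0.0 else 0.0
    # The prior term does not decay, so it dominates once the history is stale.
    return (decay * s + prior_weight * p0) / (decay * w + prior_weight)
```

With w = 0 this returns P_0 exactly, with a large and fresh w it stays close to s / w, and as t_now moves away from t_last it drifts back toward P_0.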

There's probably no immediate use for either in Mahout but both problems seem 
to be pretty common otherwise.


> Need more online averagers
> --------------------------
>
>                 Key: MAHOUT-634
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-634
>             Project: Mahout
>          Issue Type: Improvement
>    Affects Versions: 0.4
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.5
>
>         Attachments: 0001-MAHOUT-634-time-embedded-moving-averages.patch
>
>
> I am occasionally seeing a need to do exponential averaging of values or 
> rates.
> Hbase guys want this as well.
> So it is time to do it.  I have a patch that does the averaging of values 
> according to
> http://tdunning.blogspot.com/2011/03/exponential-weighted-averages-with.html
> I will attach that as a patch now and do the rate averaging as well before 
> committing.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
