[
https://issues.apache.org/jira/browse/MAHOUT-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034945#comment-13034945
]
Dmitriy Lyubimov edited comment on MAHOUT-634 at 5/17/11 6:37 PM:
------------------------------------------------------------------
Ted,
I am also using this, with slight modifications so that it works with
map-reduce. I implemented two suggestions on the side: updates to the past
(input that is unordered w.r.t. the time of sampling, albeit potentially less
numerically stable) and combining, for use with MR. See
http://weatheringthrutechdays.blogspot.com/2011/04/follow-up-for-mean-summarizer-post.html
No algorithm in Mahout currently uses MR for summarizing inputs, but one might.
These improvements made it possible to implement Pig functions that run those formulas.
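To make the two modifications concrete, here is a rough sketch in Python (chosen only for brevity). It is not the code from the patch or from the Pig functions. It assumes the summarizer state is a decayed sum of values, a decayed sum of weights and the latest timestamp seen, with decay exp(-dt/alpha); the class and method names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class DecayedMean:
    """Exponentially weighted mean over irregularly sampled values (sketch)."""
    alpha: float      # decay time scale (assumed parameter)
    s: float = 0.0    # decayed sum of values, referenced to time t
    w: float = 0.0    # decayed sum of weights, referenced to time t
    t: float = 0.0    # timestamp of the latest sample seen so far

    def update(self, x: float, t: float) -> None:
        if self.w == 0.0:
            # First sample: it defines the reference time.
            self.s, self.w, self.t = x, 1.0, t
        elif t >= self.t:
            # In-order sample: decay the history forward to the new time.
            pi = math.exp(-(t - self.t) / self.alpha)
            self.s = pi * self.s + x
            self.w = pi * self.w + 1.0
            self.t = t
        else:
            # Update to the past: discount the late sample instead.
            pi = math.exp(-(self.t - t) / self.alpha)
            self.s += pi * x
            self.w += pi

    def merge(self, other: "DecayedMean") -> None:
        """Combine two partial states, e.g. in a combiner or reducer."""
        if other.w == 0.0:
            return
        if self.w == 0.0:
            self.s, self.w, self.t = other.s, other.w, other.t
            return
        t = max(self.t, other.t)
        a = math.exp(-(t - self.t) / self.alpha)
        b = math.exp(-(t - other.t) / self.alpha)
        self.s = a * self.s + b * other.s
        self.w = a * self.w + b * other.w
        self.t = t

    def mean(self) -> float:
        return self.s / self.w
```

Under these assumptions the result does not depend on arrival order or on how the input is split, up to floating-point rounding, which is what makes the state usable in a combiner.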
I also experimented with yet another biased estimator for binomial sums
(similar to using a beta distribution as a conjugate prior for the binomial
distribution) that converges to a predefined value P_0 (similar to the
beta distribution mode converging to 0.5 as n goes to 0) under two
circumstances: 1) there is a lack of history (as in the beta-distribution-based
estimate); 2) there is a lack of _recent_ history.
There's probably no immediate use for either in Mahout, but both problems seem
to be pretty common otherwise.
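The comment does not give the formula for this estimator, so the following is only one possible construction with the same two properties, not necessarily the one I used. It assumes exponentially decayed success and trial counts plus a pseudo-count of weight eps placed at P_0; the function name and the parameters alpha and eps are hypothetical.

```python
import math


def decayed_rate(events, now, alpha, p0, eps=1.0):
    """Biased estimate of a success probability that falls back to p0.

    events: iterable of (t, success) pairs with t <= now.
    alpha:  decay time scale (assumed parameter).
    eps:    weight of the prior pseudo-count at p0 (assumed parameter).
    """
    s = 0.0  # decayed number of successes
    w = 0.0  # decayed number of trials
    for t, success in events:
        pi = math.exp(-(now - t) / alpha)
        s += pi * (1.0 if success else 0.0)
        w += pi
    # With no history, or only old history, w is near 0 and the result is near p0.
    return (s + eps * p0) / (w + eps)
```

With an empty history this returns p0, with plenty of recent trials it approaches the observed success rate, and when the only trials are much older than alpha it drifts back to p0.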
> Need more online averagers
> --------------------------
>
> Key: MAHOUT-634
> URL: https://issues.apache.org/jira/browse/MAHOUT-634
> Project: Mahout
> Issue Type: Improvement
> Affects Versions: 0.4
> Reporter: Ted Dunning
> Assignee: Ted Dunning
> Fix For: 0.5
>
> Attachments: 0001-MAHOUT-634-time-embedded-moving-averages.patch
>
>
> I am occasionally seeing a need to do exponential averaging of values or
> rates.
> The HBase guys want this as well.
> So it is time to do it. I have a patch that does the averaging of values
> according to
> http://tdunning.blogspot.com/2011/03/exponential-weighted-averages-with.html
> I will attach that as a patch now and do the rate averaging as well before
> committing.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira