Alternatively, you could use a Combiner like the following to calculate the average (though I haven't tested this bit of code). You would configure it as a scan-time iterator (either a persistent scan iterator for the table, or one attached to a particular Scanner) and use the STRING encoding type of the LongCombiner. A Combiner isn't necessarily a better way to average seven values, but I thought it would make a good example.

import java.util.Iterator;

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.iterators.LongCombiner;

public class AveragingCombiner extends LongCombiner {
  @Override
  public Long typedReduce(Key key, Iterator<Long> iter) {
    long sum = 0;
    long count = 0;
    // Sum every version of this key, counting as we go.
    while (iter.hasNext()) {
      sum = safeAdd(sum, iter.next());
      count++;
    }
    // Integer division: the average is truncated to a whole number.
    return sum / count;
  }
}

Billie

----- Original Message -----
> From: "David Medinets" <david.medin...@gmail.com>
> To: user@accumulo.apache.org
> Sent: Wednesday, April 11, 2012 10:59:46 PM
> Subject: Using Accumulo To Calculate Seven Day Rolling Average
>
> Thanks. Using this technique seems to work. I wrote a blog entry to
> document it:
>
> Using Accumulo To Calculate Seven Day Rolling Average
> http://affy.blogspot.com/2012/04/using-accumulo-to-calculate-seven-day.html
>
> On Wed, Apr 11, 2012 at 2:20 PM, Adam Fuchs <adam.p.fu...@ugov.gov> wrote:
> > David,
> >
> > In case of continuing confusion, I think it's best if you ignore Bill's
> > suggestion for now and heed Josh's advice. Bill's suggestion might be an
> > optimization to look at later on, but your initial approach seems sound.
> >
> > Adam
> >
> > On Tue, Apr 10, 2012 at 10:52 PM, David Medinets <david.medin...@gmail.com> wrote:
> >>
> >> I thought there were issues associated with doing mutations inside
> >> iterators?
> >>
> >> On Tue, Apr 10, 2012 at 10:35 PM, William Slacum <wsla...@gmail.com> wrote:
> >> > I don't think you'd necessarily need an aggregator for that, although
> >> > it doesn't seem like that's what you're doing here in the first place.
> >> > Wouldn't it be easier to set a summation iterator that also keeps a
> >> > count of observations to do some server side math and then combine it
> >> > all on the client? That way you can have a time series and to get
> >> > weekly averages you just change your scan range.
> >> >
> >> > On Apr 10, 2012, at 10:16 PM, David Medinets wrote:
> >> >
> >> >> I'm still thinking about how to use Accumulo to calculate weekly
> >> >> moving averages.
> >> >> I thought that using the maxVersions settings might
> >> >> work to maintain the last 7 values. Then a program could simply sum
> >> >> the values of a given row. So this is what I did:
> >> >>
> >> >> bin/accumulo shell -u root -p password
> >> >> > createtable rolling
> >> >> rolling> config -t rolling -s table.iterator.scan.vers.opt.maxVersions=7
> >> >> rolling> insert row cf cq 1
> >> >> rolling> insert row cf cq 2
> >> >> rolling> insert row cf cq 3
> >> >> rolling> insert row cf cq 4
> >> >> rolling> insert row cf cq 5
> >> >> rolling> insert row cf cq 6
> >> >> rolling> insert row cf cq 7
> >> >> rolling> insert row cf cq 8
> >> >> rolling> scan
> >> >> row cf:cq [] 8
> >> >> row cf:cq [] 7
> >> >> row cf:cq [] 6
> >> >> row cf:cq [] 5
> >> >> row cf:cq [] 4
> >> >> row cf:cq [] 3
> >> >> row cf:cq [] 2
> >> >>
> >> >> This is exactly what I wanted to see. So I wrote a simple scanner
> >> >> program to read the table. Then I did another scan:
> >> >>
> >> >> rolling> scan
> >> >> row cf:cq [] 8
> >> >>
> >> >> Where did the rest of the records go?
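
For reference, the arithmetic that the AveragingCombiner earlier in this message is meant to perform can be sketched in plain Java, without Accumulo on the classpath. This is only an illustration under stated assumptions: the class name RollingAverageSketch and the method averageOfEncodedLongs are hypothetical, the values are assumed to be decimal strings (as with the STRING encoding), and Math.addExact is used as a stand-in for overflow-checked addition, not as a claim about how LongCombiner.safeAdd behaves. The empty-input guard is also an addition that the combiner itself does not have.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class RollingAverageSketch {

    // Hypothetical helper: averages string-encoded longs with the same
    // sum/count loop as the combiner in the message.
    public static long averageOfEncodedLongs(Iterator<String> encoded) {
        long sum = 0;
        long count = 0;
        while (encoded.hasNext()) {
            // Assumption: each value is stored as a decimal string.
            long value = Long.parseLong(encoded.next());
            // Stand-in for overflow-checked addition; throws ArithmeticException on overflow.
            sum = Math.addExact(sum, value);
            count++;
        }
        if (count == 0) {
            // Guard added for this sketch only.
            throw new IllegalArgumentException("no values to average");
        }
        // Integer division truncates toward zero.
        return sum / count;
    }

    public static void main(String[] args) {
        // The seven versions from the first scan in the quoted shell session: 8 down to 2.
        List<String> versions = Arrays.asList("8", "7", "6", "5", "4", "3", "2");
        System.out.println(averageOfEncodedLongs(versions.iterator())); // prints 5
    }
}
```

With the seven retained versions from the quoted scan, the sum is 35 and the count is 7, so the sketch returns 5. Because the division is integral, inputs whose sum is not a multiple of the count come back truncated.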