Go for option #2 and use the combiners. It's one of the core features of Accumulo and the overhead at insert time is minimal. Developer time overhead is also minimal: add a couple of lines next to where you make your mutations and you're done.
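To make the cost model concrete, here's a rough sketch of what a summing combiner does, written as a toy in plain Python rather than against the Accumulo API. The `StatsTable` class and its methods are hypothetical names made up for illustration. In Accumulo itself you would attach something like the built-in `SummingCombiner` to the stats table instead. The point it shows: inserts are blind appends, the scan folds whatever values are pending, and since iterators can also run at compaction, the number of values left to fold at scan time tends to shrink over time.

```python
from collections import defaultdict


class StatsTable:
    """Toy in-memory model of a table with a summing combiner.

    Hypothetical illustration only; this is not the Accumulo API.
    """

    def __init__(self):
        # Each word maps to a list of values that have not been combined yet,
        # similar to several versions of one cell awaiting aggregation.
        self._entries = defaultdict(list)

    def insert(self, word, count=1):
        # A blind append: no read-modify-write against the existing total.
        self._entries[word].append(count)

    def scan(self, word):
        # At scan time the combiner folds all pending values into one sum.
        return sum(self._entries.get(word, []))

    def compact(self):
        # Compaction rewrites each word as a single pre-summed value,
        # so later scans have less to aggregate.
        for word in list(self._entries):
            self._entries[word] = [sum(self._entries[word])]

    def pending(self, word):
        # How many stored values a scan of this word would have to fold.
        return len(self._entries.get(word, []))


# Insert ['foo', 1] three times, as in option #2 of the question.
table = StatsTable()
for _ in range(3):
    table.insert("foo")
total_before_compaction = table.scan("foo")
table.compact()
total_after_compaction = table.scan("foo")
```

The total is the same before and after `compact()`; only the amount of work left for the scan changes.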
Regards,
Dylan

On Wed, Aug 26, 2015 at 6:11 PM, z11373 <[email protected]> wrote:

> Hi,
> Apologize if this question has been asked before (which I am kind of
> certain).
> I am building a triple store, and need to build the stats table which will
> be used for query optimization (i.e. re-order the query triple pattern).
> There may be more than 2 solutions for this, but the two I know are:
>
> 1. Manually rebuild the whole stats, this can be run once per day for
> example
> This option would be expensive because we are re-calculating all rows in
> master table, but the end result is no more computation when we retrieve
> the stat info. For example, we'll just query stats table for word 'foo',
> and it'll return a single row with total items for that word.
>
> 2. Use Accumulo combiner
> With this option, we could simply add the counter to the stats table (i.e.
> insert ['foo', 1]) whenever we insert 'foo' to master table. When we want
> to get the stat info during query time, Accumulo will actually aggregate
> all the count for that word 'foo' in map-reduce fashion.
> For #2, we pay the cost during scan time, but if the rows that have word
> 'foo' only in hundredth, I guess it won't be so bad, because that
> aggregation will be done on the server side (and it'd be optimized due to
> Accumulo design)
>
> I prefer option #2, but not sure how expensive is that on Accumulo,
> especially we'll do a big number of queries per day, than that stats
> re-calculating process which is once per day. Any comments on this?
> Please let me know if my problem statement or the question is unclear.
>
> Thanks,
> Z
>
> --
> View this message in context:
> http://apache-accumulo.1065345.n5.nabble.com/using-combiner-vs-building-stats-cache-tp14979.html
> Sent from the Developers mailing list archive at Nabble.com.
