The Accumulo manual states that combiners can be applied to values which share the same rowID, column family, and column qualifier. Is there any way to adjust this behaviour? I have rows that look like:
000200001ccaac30 meta:size [] 1807
000200001ccaac30 meta:source [] data2
000200001cdaac30 meta:filename [] doc02985453
000200001cdaac30 meta:size [] 656
000200001cdaac30 meta:source [] data2
000200001cfaac30 meta:filename [] doc04484522
000200001cfaac30 meta:size [] 565
000200001cfaac30 meta:source [] data2
000200001dcaac30 meta:filename [] doc03342958

and I'd like to sum up all the values of meta:size across all rows. I know I can scan the sizes and sum them on the client side, but I was hoping there would be a way to do this inside my cluster. Is MapReduce my only option here?

Thanks,
-Russ
