I've been playing around with the LongCombiner on a table that's summing up the counts of output of a MapReduce job, very similar to the WordCount example from the user manual.
I started out encoding the values using LongCombiner.FIXED_LEN_ENCODER, but have noticed that this can lead to some confusion later on downstream. For example, a co-worker was scanning using the shell and was caught off guard by the encoded values. Also, out of the box, the StatsCombiner example works using String values, not Long values so we built a custom piece to essentially do the same thing with Long values instead. It looks to me like most of the examples I've seen just store things are String values, rather than encoding them. What are the tradeoffs? We're at a point where we could pretty easily switch things to just use strings - it seems like that might make things more convenient from a maintenance perspective (human readable values) and would allow us to re-use some existing components (e.g. StatsCombiner). Any thoughts? Thanks, Mike
