Thanks - String it is!
On Mon, May 13, 2013 at 7:47 PM, Christopher <[email protected]> wrote:
> Well, encoding it might save space, but strings are nice and
> human-readable, especially in the shell, and in the overall scheme of
> things, a string probably isn't really that much larger on disk,
> especially after compression.
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Mon, May 13, 2013 at 6:09 PM, Mike Hugo <[email protected]> wrote:
> > I've been playing around with the LongCombiner on a table that's summing up
> > the counts of output of a MapReduce job, very similar to the WordCount
> > example from the user manual.
> >
> > I started out encoding the values using LongCombiner.FIXED_LEN_ENCODER, but
> > have noticed that this can lead to some confusion later on downstream. For
> > example, a co-worker was scanning using the shell and was caught off guard
> > by the encoded values. Also, out of the box, the StatsCombiner example
> > works using String values, not Long values, so we built a custom piece to
> > essentially do the same thing with Long values instead.
> >
> > It looks to me like most of the examples I've seen just store things as
> > String values, rather than encoding them. What are the tradeoffs? We're at
> > a point where we could pretty easily switch things to just use strings - it
> > seems like that might make things more convenient from a maintenance
> > perspective (human readable values) and would allow us to re-use some
> > existing components (e.g. StatsCombiner). Any thoughts?
> >
> > Thanks,
> >
> > Mike
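
For anyone finding this thread later, here is a rough sketch of the size tradeoff discussed above. It does not call Accumulo at all. It assumes the fixed-length encoding is a plain 8-byte big-endian long and that the string encoding is the decimal text in UTF-8; the class and method names (EncodingSizes, encodeFixed, encodeString, and so on) are made up for illustration and are not part of LongCombiner.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class EncodingSizes {

    // Assumed stand-in for a fixed-length encoding: always 8 bytes, big-endian.
    static byte[] encodeFixed(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    static long decodeFixed(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    // Assumed stand-in for a string encoding: decimal digits as UTF-8 text.
    static byte[] encodeString(long v) {
        return Long.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    static long decodeString(byte[] b) {
        return Long.parseLong(new String(b, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        long[] samples = {7L, 42L, 1_000_000L, Long.MAX_VALUE};
        for (long v : samples) {
            // Print the encoded size of each sample under both schemes.
            System.out.printf("%d: fixed=%d bytes, string=%d bytes%n",
                    v, encodeFixed(v).length, encodeString(v).length);
        }
    }
}
```

Under these assumptions the string form costs one byte per digit (plus one for a minus sign), so it is smaller than the fixed form for counts below eight digits and tops out at 19 bytes for Long.MAX_VALUE, which fits the point that strings are not much larger on disk. The string bytes are also what a person scanning in the shell can read directly.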
