Sounds like a useful addition. I just checked to see if this could easily be done by adding another named OptionDescriber param, but it doesn't look like RowEncodingIterator (or WholeRowIterator) implements that interface. Making RowEncodingIterator implement OptionDescriber and passing these options that way seems like the most backwards-compatible approach.
Or, did you have another implementation in mind? Feel free to open a JIRA issue and corresponding PR.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii

On Tue, Mar 31, 2015 at 7:05 PM, Russ Weeks <[email protected]> wrote:
> Hi, folks!
>
> How do you feel about adding a couple of parameters to RowEncodingIterator
> to limit the number of keys and/or the total size of the values in the
> "keys" and "values" lists?
>
> The WholeRowIterator is an awesome convenience but I've caused more than a
> few OOM errors by applying it to rows that it shouldn't be applied to. It
> would be nice to have a safeguard so that this mistake manifests as an
> IOException instead of a dead tablet server.
>
> The failure case is actually really bad when I make this mistake in a MR
> job because I think it kills my tablet servers one by one as YARN retries
> the job.
>
> Of course, these would be optional parameters and the default would be to
> not impose a limit, to preserve current behaviour.
>
> If this would be useful, I'm happy to put together a PR.
>
> -Russ
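
To make the proposal above concrete, here is a rough, standalone sketch of the safeguard: a buffer that reads optional limits from a string options map (the form in which iterator options arrive) and throws an IOException once a row exceeds them. It deliberately does not touch the Accumulo API. The class name BoundedRowBuffer and the option names "maxKeys" and "maxValueBytes" are made up for illustration, and how this would actually be wired into RowEncodingIterator is left open.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Accumulo: buffers the values of one row
// and fails fast once the optional limits are exceeded.
final class BoundedRowBuffer {
    // Hypothetical option names. A missing option or "0" means "no limit",
    // which preserves the current unbounded behaviour by default.
    static final String MAX_KEYS_OPT = "maxKeys";
    static final String MAX_VALUE_BYTES_OPT = "maxValueBytes";

    private final long maxKeys;
    private final long maxValueBytes;
    private final List<byte[]> values = new ArrayList<>();
    private long valueBytes = 0;

    BoundedRowBuffer(Map<String, String> options) {
        this.maxKeys = parseLimit(options, MAX_KEYS_OPT);
        this.maxValueBytes = parseLimit(options, MAX_VALUE_BYTES_OPT);
    }

    // Parses a non-negative limit; an absent option is treated as 0 (unlimited).
    private static long parseLimit(Map<String, String> options, String name) {
        String raw = options.get(name);
        if (raw == null) {
            return 0;
        }
        long limit = Long.parseLong(raw.trim());
        if (limit < 0) {
            throw new IllegalArgumentException(name + " must be >= 0");
        }
        return limit;
    }

    // Adds one value, checking both limits before anything is buffered so
    // the failure surfaces as an IOException instead of an OOM error.
    void add(byte[] value) throws IOException {
        if (maxKeys > 0 && values.size() + 1 > maxKeys) {
            throw new IOException("Row exceeds " + MAX_KEYS_OPT + "=" + maxKeys);
        }
        if (maxValueBytes > 0 && valueBytes + value.length > maxValueBytes) {
            throw new IOException("Row exceeds " + MAX_VALUE_BYTES_OPT + "=" + maxValueBytes);
        }
        values.add(value);
        valueBytes += value.length;
    }

    int size() {
        return values.size();
    }

    long totalValueBytes() {
        return valueBytes;
    }

    // Resets the buffer so it can be reused for the next row.
    void clear() {
        values.clear();
        valueBytes = 0;
    }
}
```

With an empty options map the buffer accepts any number of values, so existing users would see no change. Setting "maxKeys" to 2, for example, makes the third add() on a row throw an IOException.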
