The TransformingIterator already has an option for this. It would be nice to reuse the same method and option names, and to keep the same behavior.
http://accumulo.apache.org/1.5/apidocs/org/apache/accumulo/core/iterators/user/TransformingIterator.html#setMaxBufferSize%28org.apache.accumulo.core.client.IteratorSetting,%20long%29

On Tue, Mar 31, 2015 at 7:05 PM, Russ Weeks <[email protected]> wrote:

> Hi, folks!
>
> How do you feel about adding a couple of parameters to RowEncodingIterator
> to limit the number of keys and/or the total size of the values in the
> "keys" and "values" lists?
>
> The WholeRowIterator is an awesome convenience but I've caused more than a
> few OOM errors by applying it to rows that it shouldn't be applied to. It
> would be nice to have a safeguard so that this mistake manifests as an
> IOException instead of a dead tablet server.
>
> The failure case is actually really bad when I make this mistake in a MR
> job because I think it kills my tablet servers one by one as YARN retries
> the job.
>
> Of course, these would be optional parameters and the default would be to
> not impose a limit, to preserve current behaviour.
>
> If this would be useful, I'm happy to put together a PR.
>
> -Russ
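
For illustration, the safeguard discussed in this thread might look roughly like the sketch below. It is plain Java with no Accumulo dependency, so it is not the RowEncodingIterator or TransformingIterator code. The class `BoundedRowBuffer`, the option name `maxBufferSize`, the choice to count key bytes as well as value bytes, and the convention that a missing or zero option means "no limit" are all assumptions made for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: buffers one row's entries and fails fast when a size limit is exceeded. */
public class BoundedRowBuffer {
  // Assumed option name, picked to echo the setMaxBufferSize method linked above.
  public static final String MAX_BUFFER_SIZE_OPT = "maxBufferSize";

  private final long maxBufferSize; // 0 means "no limit" (assumed default)
  private final List<byte[]> keys = new ArrayList<>();
  private final List<byte[]> values = new ArrayList<>();
  private long bufferedBytes = 0;

  public BoundedRowBuffer(Map<String, String> options) {
    String opt = options.get(MAX_BUFFER_SIZE_OPT);
    this.maxBufferSize = (opt == null) ? 0 : Long.parseLong(opt);
    if (this.maxBufferSize < 0) {
      throw new IllegalArgumentException(MAX_BUFFER_SIZE_OPT + " must be >= 0");
    }
  }

  /** Adds one entry, or throws IOException if doing so would exceed the limit. */
  public void add(byte[] key, byte[] value) throws IOException {
    long newSize = bufferedBytes + key.length + value.length;
    if (maxBufferSize > 0 && newSize > maxBufferSize) {
      // Surface the mistake to the client instead of running the server out of memory.
      throw new IOException("Row needs " + newSize + " bytes, limit is " + maxBufferSize);
    }
    keys.add(key);
    values.add(value);
    bufferedBytes = newSize;
  }

  public long bufferedBytes() {
    return bufferedBytes;
  }

  public int size() {
    return keys.size();
  }

  public static void main(String[] args) throws IOException {
    Map<String, String> options = new HashMap<>();
    options.put(MAX_BUFFER_SIZE_OPT, "10");
    BoundedRowBuffer buffer = new BoundedRowBuffer(options);
    buffer.add(new byte[2], new byte[3]); // 5 bytes buffered, within the limit
    try {
      buffer.add(new byte[4], new byte[4]); // would reach 13 bytes
    } catch (IOException e) {
      System.out.println("Rejected oversized row: " + e.getMessage());
    }
  }
}
```

With no option set, the sketch imposes no limit, which matches the "preserve current behaviour" default proposed above. A real patch would presumably do this check inside the iterator while it fills its "keys" and "values" lists.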
