I had the same question once when I was playing with DefaultTuple. I removed that null-addition loop, but then it failed while setting an object at a specific index. There is code in Pig which sets a value at a specific index in the tuple, and it will throw an exception (IndexOutOfBoundsException at the ArrayList level) if you don't first add nulls to the list. This is one of those oddities of Java.

    List<String> list = new ArrayList<String>(3);
    list.set(1, "mystring");

throws IndexOutOfBoundsException, while the following doesn't:

    List<String> list = new ArrayList<String>(3);
    for (int i = 0; i < 3; i++) {
        list.add(null);
    }
    list.set(1, "mystring");

The constructor argument only sets the initial capacity; the size of the list is still 0 until elements are added. Since client code using DefaultTuple is permitted to set a value at any index of the tuple, we have no choice but to null-fill the list first.

Hope it helps,
Ashutosh

On Sat, May 26, 2012 at 9:13 PM, Jonathan Coveney <jcove...@gmail.com> wrote:

> -user
> +dev
>
> Haha, I made this very same comment somewhere, and noticed the exact same
> thing (I think I mention it in my SchemaTuple benchmarking).
>
> The reason is so that tuple.size() will return the right value. Also, the
> expectation is that if you append, it goes to the end of all of the nulls,
> not the first position. It's a little confusing, and yeah, it surprised me
> too.
>
> You could definitely amortize the cost of creation over the sets that the
> user does by keeping an index, but when I first saw it I decided that the
> (slightly) increased memory footprint and the increase in code complexity
> wasn't worth a very minimal increase.
>
> 2012/5/26 Prashant Kommireddi <prash1...@gmail.com>
>
> > I rambled across this while reviewing one of Jon's patches. Here is the
> > code from DefaultTuple
> >
> >     /**
> >      * Construct a tuple with a known number of fields. Package level so
> >      * that callers cannot directly invoke it.
> >      * <br>Resulting tuple is filled pre-filled with null elements. Time
> >      * complexity: O(N), after allocation
> >      *
> >      * @param size
> >      *            Number of fields to allocate in the tuple.
> >      */
> >     DefaultTuple(int size) {
> >         mFields = new ArrayList<Object>(size);
> >         for (int i = 0; i < size; i++)
> >             mFields.add(null);
> >     }
> >
> > Why are we walking through the list to add nulls? Wouldn't the initial
> > creation of ArrayList suffice?
> > mFields = new ArrayList<Object>(size) should be enough.
> >
> > Thanks,
> > Prashant
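
For anyone who wants to try the behaviour discussed in this thread, here is a minimal standalone sketch. It is not Pig's DefaultTuple: the class NullFilledTupleSketch and all of its methods are hypothetical names made up for illustration, and it only assumes a tuple backed by a plain ArrayList. It shows that after null-filling, size() reports the requested field count, set() works at any index below that count, and an append lands after the nulls.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an ArrayList-backed tuple; NOT Pig's DefaultTuple.
class NullFilledTupleSketch {
    private final List<Object> fields;

    NullFilledTupleSketch(int size) {
        // The constructor argument only reserves capacity; fields.size() is still 0.
        fields = new ArrayList<Object>(size);
        // Null-fill so every index in [0, size) is a valid target for set().
        for (int i = 0; i < size; i++) {
            fields.add(null);
        }
    }

    int size() { return fields.size(); }

    void set(int index, Object value) { fields.set(index, value); }

    Object get(int index) { return fields.get(index); }

    // Appends after the existing (possibly null) fields.
    void append(Object value) { fields.add(value); }

    // Returns true if set() fails on a list that was only given an initial capacity.
    static boolean setFailsWithoutNullFill() {
        List<String> list = new ArrayList<String>(3);
        try {
            list.set(1, "mystring");
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        NullFilledTupleSketch t = new NullFilledTupleSketch(3);
        t.set(1, "mystring");
        t.append("tail");
        System.out.println("size after append: " + t.size());
        System.out.println("set without null-fill fails: " + setFailsWithoutNullFill());
    }
}
```

The trade-off Jonathan mentions still applies to a sketch like this: the O(N) fill happens once at construction, in exchange for simple set() and size() logic with no extra index bookkeeping.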