I appreciate that. We should probably feature this prominently somewhere in the documentation.
2012/2/14 Alan Gates <[email protected]>

> Originally Tuples were written to allow in place modifications. Lately
> we've started doing things to tuples that would violate that, such as the
> work Dmitriy's done to use Tuples of specific types as an optimization in
> certain situations and the work we're doing in HCat to make tuple coming
> from HCat a thin wrapper over HCatRecord which is again a thin wrapper over
> a Hive SerDe.
>
> So, to actually answer your question, it's generally better to create a
> new tuple.
>
> Alan.
>
> On Feb 14, 2012, at 11:14 AM, Jonathan Coveney wrote:
>
> > Thanks, Alan. Is this the same case for Tuples? For example, if I were to
> > take Tuples from an input, append the value, then add that tuple to a new
> > bag, is that safe? Or can Tuples be modified after the fact as well?
> >
> > 2012/2/14 Alan Gates <[email protected]>
> >
> >> No. Bags are written with the explicit assumption that once reading
> >> begins, there will never be another write to the bag. This simplifies a
> >> lot of the code in the bags as far as spilling.
> >>
> >> Alan.
> >>
> >> On Feb 13, 2012, at 6:19 PM, Jonathan Coveney wrote:
> >>
> >>> I feel like the answer is that it is not safe, but I'd like to make sure.
> >>> IE is the following ok, and if it is not, why not?
> >>>
> >>> public DataBag exec(Tuple input) throws IOException {
> >>>     DataBag bag = (DataBag)input.get(0);
> >>>     long index=0;
> >>>     for (Tuple tuple : bag) {
> >>>         tuple.append(index++);
> >>>     }
> >>>     return bag;
> >>> }
> >>>
> >>> Appreciate the guidance.
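
As a rough illustration of the advice in the thread above (do not write to a bag once it is being read, and prefer building a new tuple over modifying one in place), here is a minimal sketch of the copy-then-append pattern. It deliberately uses plain java.util lists as stand-ins: a "tuple" is a List<Object> and a "bag" is a list of tuples. The class name AppendIndexSketch and the method appendIndex are hypothetical and are not part of Pig. In a real UDF the new tuple and bag would presumably come from Pig's TupleFactory and BagFactory; check the Pig Javadoc for the exact calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in sketch: a "tuple" is modelled as List<Object> and a "bag" as a
// List of tuples. These are NOT Pig's Tuple/DataBag types.
class AppendIndexSketch {

    // Returns a new bag whose tuples are copies of the input tuples with a
    // running index appended. Neither the input bag nor its tuples are written to.
    static List<List<Object>> appendIndex(List<List<Object>> inputBag) {
        List<List<Object>> outputBag = new ArrayList<>(inputBag.size());
        long index = 0;
        for (List<Object> tuple : inputBag) {
            // Copy the fields first so the original tuple stays untouched.
            List<Object> copy = new ArrayList<>(tuple.size() + 1);
            copy.addAll(tuple);
            copy.add(index++);
            outputBag.add(copy);
        }
        return outputBag;
    }

    public static void main(String[] args) {
        // Unmodifiable inputs would throw if appendIndex tried to write to them.
        List<List<Object>> input = new ArrayList<>();
        input.add(Collections.unmodifiableList(Arrays.<Object>asList("a", 1)));
        input.add(Collections.unmodifiableList(Arrays.<Object>asList("b", 2)));
        System.out.println(appendIndex(Collections.unmodifiableList(input)));
    }
}
```

The design point is only that the function reads from its input and writes to freshly allocated objects, so it makes no assumption about whether the input tuples or bag tolerate modification.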
