Alan Gates
Tue, 10 Jun 2008 16:20:08 -0700
All this said, the code that handles spilling bags and freeing memory does not work ideally yet (as you've seen from the discussions regarding the GC overhead bug), so Pig sometimes dies when it shouldn't.
For a reference on spilling, see src/org/apache/pig/data/DataBag.java and the classes that extend it.
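To make the idea of a spilling bag concrete, here is a minimal sketch of the general technique. It is not the code in DataBag.java: the class name SpillableBagSketch, its methods, and the fixed record-count threshold are all hypothetical, and the threshold merely stands in for whatever memory-pressure signal the real implementation responds to. Records are assumed to be single-line strings.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, NOT Pig's DataBag: keeps records in memory and
// spills them to a temp file once a fixed record count is exceeded.
class SpillableBagSketch {
    private final int maxInMemory;                       // assumed threshold, in records
    private final List<String> inMemory = new ArrayList<>();
    private final List<File> spillFiles = new ArrayList<>();
    private long size = 0;

    SpillableBagSketch(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    // Add a record; spill when the in-memory list grows past the threshold.
    void add(String record) throws IOException {
        inMemory.add(record);
        size++;
        if (inMemory.size() > maxInMemory) {
            spill();
        }
    }

    // Write the in-memory records to a temp file, one per line, then release them.
    private void spill() throws IOException {
        File f = File.createTempFile("bag-spill", ".txt");
        f.deleteOnExit();
        try (BufferedWriter w = Files.newBufferedWriter(f.toPath(), StandardCharsets.UTF_8)) {
            for (String r : inMemory) {
                w.write(r);
                w.newLine();
            }
        }
        spillFiles.add(f);
        inMemory.clear();
    }

    long size() {
        return size;
    }

    int spillCount() {
        return spillFiles.size();
    }

    // Stream every record in insertion order: spilled files first, then memory.
    // Only one line from disk is held at a time, so the bag is never fully reloaded.
    void forEach(Consumer<String> action) throws IOException {
        for (File f : spillFiles) {
            try (BufferedReader r = Files.newBufferedReader(f.toPath(), StandardCharsets.UTF_8)) {
                String line;
                while ((line = r.readLine()) != null) {
                    action.accept(line);
                }
            }
        }
        for (String rec : inMemory) {
            action.accept(rec);
        }
    }
}
```

With a threshold of 2, adding five records spills the first three to disk when the third arrives and keeps the last two in memory; forEach then visits all five in the order they were added.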
Alan.

Mridul Muralidharan wrote:
Hi,
How does Pig handle really large tuples? Suppose that after a group, the resulting alias has a small subset of tuples (out of the many that were generated) that are really large, in excess of a gig as a ballpark figure (so that a tuple is spread across many DFS blocks). Does Pig handle this case? If yes, how (refs/rtfm would be great too)?
Thanks,
Mridul