Greg, Pig 0.8 tells you which job is responsible for which set of operators. You can save all the inputs to the map-only job by inserting intermediate stores, and then debug just that job.
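A rough sketch of what I mean by intermediate stores. I have not seen your script, so every alias, schema, and path below is made up; substitute whichever relation feeds the operators that Pig attributes to the failing job. First, a script that stops at that point and saves the relation:

```pig
-- debug_inputs.pig (hypothetical name): aliases, schema, and paths are placeholders.
raw      = LOAD 'input/data' AS (id:chararray, payload:chararray);
filtered = FILTER raw BY payload IS NOT NULL;

-- Intermediate store: persist the relation that feeds the suspect operators.
STORE filtered INTO 'debug/filtered';
```

Then a second script that reruns only the suspect operators against the saved data:

```pig
-- debug_map_only.pig (hypothetical name): isolate the operators of the failing job.
filtered = LOAD 'debug/filtered' AS (id:chararray, payload:chararray);

-- Stand-in for whatever the MAP_ONLY job actually does in the real script.
expanded = FOREACH filtered GENERATE id, FLATTEN(TOKENIZE(payload));
STORE expanded INTO 'debug/expanded';
```

With the default PigStorage the saved data is plain tab-delimited text, so something like `hadoop fs -cat debug/filtered/part-* | head` should let you look at the records directly and check for an oversized one.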
D

On Wed, Jan 26, 2011 at 2:49 PM, Greg Langmead <[email protected]> wrote:

> Pig 0.8 executes my script by running six jobs. One of them is identified as
> "MAP_ONLY" and it always fails, with the innermost error I can find either
> saying "GC overhead limit exceeded" or "Java heap space". I suspect I have a
> piece that is too large. How can I get my hands on the actual data it was
> processing, so I can ascertain the cause? The task log says "Input records
> from tmp1872359169" can I view that data?
>
> Thanks,
>
> Greg Langmead | Senior Research Scientist | SDL Language Weaver | (t) +1 310 437 7300
