Pig 0.8 executes my script by running six jobs. One of them is identified as "MAP_ONLY" and it always fails; the innermost error I can find says either "GC overhead limit exceeded" or "Java heap space". I suspect I have a piece that is too large. How can I get my hands on the actual data it was processing, so I can ascertain the cause? The task log says "Input records from tmp1872359169". Can I view that data?
Thanks,
Greg Langmead | Senior Research Scientist | SDL Language Weaver | (t) +1 310 437 7300
