Hi Dragos,

You might be hitting PIG-1815 (https://issues.apache.org/jira/browse/PIG-1815), which was resolved in the Pig 0.8 branch after the official release. We are likely to release a new 0.8 patch (pending discussion) with the fixes. Does your Pig jar have this fix? If not, can you please try building from http://svn.apache.org/repos/asf/pig/branches/branch-0.8 and try again with the new jar?

On 2/18/11 12:26 PM, "Dragos Munteanu" <[email protected]> wrote:

> Hi all,
>
> I have a Pig script that only runs if I turn on "-no_multiquery".
>
> My questions are:
> - is it expected that Pig's multiquery execution would create enough of an
> overhead that the execution should fail?

It is not expected to fail.

> - can someone explain (or point me to an explanation) of where the
> multiquery overhead comes from? I'd really like to understand it

With multi-query execution you end up doing more computation per task, so an issue such as PIG-1815 might not cause failures in the non-multiquery case. Also, PIG-1815 is caused by physical plan copies not being freed, and the multi-query physical plan will be larger.

> - is there a better way to write the pig code to do that computation? Maybe
> I can re-structure my computation, or configure my cluster differently? Or
> am I stuck with a no_multiquery execution?

If your query does not work with the latest from the 0.8 branch, please let us know.

-Thejas
