[ https://issues.apache.org/jira/browse/PIG-14?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12546501 ]
Utkarsh Srivastava commented on PIG-14:
---------------------------------------
Looks good, except for the slightly longer-term issue that we seem to be
pushing more and more Hadoop-specific code into our core codebase. It would be
nice if we could maintain a clean separation between our data model and the
underlying execution platform.
Thus, my comment relates more to https://issues.apache.org/jira/browse/PIG-32.
For example, what happens if we are executing in local mode? The patch
happens to work because PigMapReduce.reporter is null in that case. In general,
the correct approach seems to be for progress() to be an abstract method
provided by the execution engine.
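
To make the suggestion concrete, here is a minimal sketch of what such a separation could look like. Every name in it (ExecutionEngine, LocalEngine, CountingEngine, KeyProcessor) is hypothetical and not taken from the Pig codebase. CountingEngine only stands in for a Hadoop-backed engine; a real one would presumably forward the call to the Hadoop reporter.

```java
// All names below are hypothetical; none are taken from the Pig codebase.

/** What the data-model code is allowed to see of the execution platform. */
abstract class ExecutionEngine {
    /** Tell the platform that the task is still alive. */
    abstract void progress();
}

/** Local mode: no task tracker is watching, so reporting is a no-op. */
final class LocalEngine extends ExecutionEngine {
    @Override
    void progress() {
        // Nothing to report in local mode.
    }
}

/** Stand-in for a Hadoop-backed engine; it only counts the calls it receives. */
final class CountingEngine extends ExecutionEngine {
    private int calls;

    @Override
    void progress() {
        calls++;
    }

    int calls() {
        return calls;
    }
}

/** Core code depends on the abstraction only, never on a Hadoop class. */
final class KeyProcessor {
    private final ExecutionEngine engine;

    KeyProcessor(ExecutionEngine engine) {
        this.engine = engine;
    }

    int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
            engine.progress(); // same call in local mode and on a cluster
        }
        return total;
    }
}
```

With this shape the core code needs no null check on a static reporter field: local mode simply supplies the no-op engine.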
> large key cause pig reduce jobs to die
> --------------------------------------
>
> Key: PIG-14
> URL: https://issues.apache.org/jira/browse/PIG-14
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Olga Natkovich
> Assignee: Olga Natkovich
> Attachments: heartbeat.patch
>
>
> The reducer sends a heartbeat to the task tracker every time it starts
> processing a new key. The task tracker expects to get a message every
> 10 minutes. If processing of an individual key takes longer, which could be
> the case for your job, the task tracker would not get a heartbeat in time
> and would kill the task.
> The current patch is to add
>   <property>
>     <name>mapred.task.timeout</name>
>     <value>0</value>
>     <description>timeout value</description>
>   </property>
> to the cluster's hadoop-site.xml. This disables the heartbeat
> functionality, which might not be what we want long term.
> A more flexible approach is to report periodically from the map and reduce jobs via
> http://lucene.apache.org/hadoop/api/org/apache/hadoop/mapred/Reporter.html#setStatus(java.lang.String)
> As a workaround for a UDF, call PigMapReduce.reporter.progress() every
> 1000th time.
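
The "every 1000th time" workaround can be sketched as a small counter that forwards only every Nth tick to a heartbeat callback. ThrottledHeartbeat is a hypothetical helper, not part of Pig or Hadoop. Inside a UDF the callback would presumably wrap the PigMapReduce.reporter.progress() call mentioned above. A null callback is tolerated so that the same code also runs where no reporter exists, as in local mode.

```java
/** Hypothetical helper: forwards every Nth tick to a heartbeat callback. */
final class ThrottledHeartbeat {
    private final Runnable heartbeat; // may be null when no reporter is available
    private final int interval;
    private long ticks;

    ThrottledHeartbeat(Runnable heartbeat, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        this.heartbeat = heartbeat;
        this.interval = interval;
    }

    /** Call once per unit of work, for example once per UDF invocation. */
    void tick() {
        ticks++;
        if (heartbeat != null && ticks % interval == 0) {
            heartbeat.run();
        }
    }
}
```

An interval of 1000 mirrors the number in the issue description; the right value depends on how long each unit of work takes relative to the task timeout.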
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.