Akihiro Matsukawa created PIG-4204:
--------------------------------------
Summary: UDFContext needs concept of Hadoop 2 Application Master
Key: PIG-4204
URL: https://issues.apache.org/jira/browse/PIG-4204
Project: Pig
Issue Type: Improvement
Reporter: Akihiro Matsukawa
Priority: Minor
Because Pig's UDFs are instantiated both on the client and in the backend,
many UDFs rely on the UDFContext.isFrontend() method to determine their
behavior.
This distinction worked fine in Hadoop 1, but in Hadoop 2 a UDF can also be
instantiated in a third environment, the Application Master (AM). This is a
sort of in-between environment: we are neither on the client nor on a worker
node.
For example, in Hadoop 2.4, split computation was pushed to the AM
(MAPREDUCE-207). If a loader does initialization that depends on
UDFContext.isFrontend(), its behavior on the AM is undefined. Currently it
performs the backend action, but the AM might not have all the information a
task node would, which causes breakage.
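To make the idea concrete, here is a minimal sketch of the three-way
distinction described above. It does not use Pig's classes: ExecutionSite,
classify() and initAction() are hypothetical names, and the
inApplicationMaster flag is an assumed input, since no reliable way to detect
the AM is known. The only behavior taken from this ticket is that the frontend
check is false on the AM, so the AM currently falls through to the backend
path.

```java
public class ExecutionSiteSketch {

    /** Hypothetical classification; Pig itself only exposes isFrontend(). */
    enum ExecutionSite { CLIENT, APPLICATION_MASTER, TASK }

    /**
     * isFrontend stands in for UDFContext.isFrontend().
     * inApplicationMaster is an assumed signal that does not exist today.
     */
    static ExecutionSite classify(boolean isFrontend, boolean inApplicationMaster) {
        if (isFrontend) {
            return ExecutionSite.CLIENT;
        }
        // Not the frontend: without the extra flag, the AM and a task
        // look identical, which is the problem this ticket describes.
        return inApplicationMaster ? ExecutionSite.APPLICATION_MASTER
                                   : ExecutionSite.TASK;
    }

    /** Placeholder initialization choices a loader might make per site. */
    static String initAction(ExecutionSite site) {
        switch (site) {
            case CLIENT:
                return "client-init";
            case APPLICATION_MASTER:
                return "split-computation-init";
            default:
                return "task-init";
        }
    }

    public static void main(String[] args) {
        // Print the chosen action for each of the three environments.
        System.out.println(initAction(classify(true, false)));
        System.out.println(initAction(classify(false, true)));
        System.out.println(initAction(classify(false, false)));
    }
}
```

With a two-way isFrontend() check alone, the second and third cases above are
indistinguishable, so a loader would run its task initialization on the AM.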
Unfortunately, I've found no easy way to check whether we are in the AM rather
than in a map/reduce task, but I'm filing this ticket in case others have
insights.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)