[
https://issues.apache.org/jira/browse/PIG-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on PIG-3898 started by Cheolsoo Park.
> Refactor PPNL for non-MR execution engine
> -----------------------------------------
>
> Key: PIG-3898
> URL: https://issues.apache.org/jira/browse/PIG-3898
> Project: Pig
> Issue Type: Task
> Reporter: Cheolsoo Park
> Assignee: Cheolsoo Park
> Fix For: 0.13.0
>
> Attachments: PIG-3898-1-tez.patch, PIG-3898-1-trunk.patch
>
>
> Currently, PPNL (PigProgressNotificationListener) assumes the MR plan, and
> thus it's not compatible with non-MR execution engines. To support them, I
> propose we change the initialPlanNotification() method as follows:
> {code:title=from}
> public void initialPlanNotification(String scriptId, MROperPlan plan);
> {code}
> {code:title=to}
> public void initialPlanNotification(String scriptId, OperatorPlan<?> plan);
> {code}
> Since MROperPlan and TezOperPlan are both subclasses of OperatorPlan, this method
> can take both plans. In addition, if we add a new execution engine in the
> future, it won't break the interface again as long as we build the operator
> plan as a subclass of OperatorPlan.
> With this approach, applications such as Ambrose / Lipstick should be able to
> dynamically cast OperatorPlan to a concrete subclass depending on the
> ExecType.
> One disadvantage is that this isn't backward compatible with Pig 0.12 and
> older. But it only requires minor changes to existing PPNL implementations,
> and backward compatibility is broken only once.
> I also considered an alternative approach: adding a separate PPNL for Tez.
> But that approach has two problems.
> # Pig registers PPNL via the Main function, and right now, only one PPNL can
> be registered. So having more than one PPNL would require quite a few code
> changes in Main, ScriptState, and so on.
> # Multiple PPNL interfaces mean multiple PPNL implementations. This results
> in more (duplicate) code in applications such as Ambrose / Lipstick.
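
The dynamic cast mentioned in the description could look roughly like the sketch below. It is not taken from the attached patches. The plan classes here are simplified stand-ins for Pig's real OperatorPlan, MROperPlan and TezOperPlan, and jobCount(), vertexCount(), PlanListener and PlanDispatchSketch are hypothetical names introduced only for illustration. For brevity the sketch dispatches with instanceof rather than by inspecting the ExecType.

```java
// Simplified stand-ins for Pig's plan classes (assumptions, not the real API).
abstract class OperatorPlan<E> {}

class MROperPlan extends OperatorPlan<String> {
    int jobCount() { return 2; }      // hypothetical accessor
}

class TezOperPlan extends OperatorPlan<String> {
    int vertexCount() { return 3; }   // hypothetical accessor
}

// Hypothetical listener interface carrying the proposed signature.
interface PlanListener {
    void initialPlanNotification(String scriptId, OperatorPlan<?> plan);
}

class PlanDispatchSketch implements PlanListener {
    String lastSeen = "none";

    @Override
    public void initialPlanNotification(String scriptId, OperatorPlan<?> plan) {
        lastSeen = describe(plan);
    }

    // Downcast the generic plan to the concrete subclass for each engine.
    static String describe(OperatorPlan<?> plan) {
        if (plan instanceof MROperPlan) {
            MROperPlan mr = (MROperPlan) plan;
            return "MR plan with " + mr.jobCount() + " jobs";
        } else if (plan instanceof TezOperPlan) {
            TezOperPlan tez = (TezOperPlan) plan;
            return "Tez plan with " + tez.vertexCount() + " vertices";
        }
        // A future engine's plan still reaches the listener without an interface change.
        return "unknown plan";
    }

    public static void main(String[] args) {
        PlanDispatchSketch listener = new PlanDispatchSketch();
        listener.initialPlanNotification("script-1", new TezOperPlan());
        System.out.println(listener.lastSeen);
    }
}
```

With this shape, a single listener implementation handles both engines, which is the duplication the second problem above is concerned with.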