[ https://issues.apache.org/jira/browse/PIG-3898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cheolsoo Park updated PIG-3898:
-------------------------------
    Attachment: PIG-3898-1-tez.patch
                PIG-3898-1-trunk.patch

Thank you everyone for your comments. I am uploading my WIP patches. The changes include:
# Update initialPlanNotification() in PPNL as suggested.
# Move non-MR-specific code from MRScriptState to ScriptState. PIG-3419 moved too much code to MRScriptState, so I am moving back to ScriptState whatever is applicable to both MR and Tez.

Regarding exposing OperatorPlan in the API, I agree that we should avoid it if possible. But since Ambrose and Lipstick use it heavily, it won't be easy to take it back at this point. Nevertheless, it's definitely worth discussing.

> Refactor PPNL for non-MR execution engine
> -----------------------------------------
>
>                 Key: PIG-3898
>                 URL: https://issues.apache.org/jira/browse/PIG-3898
>             Project: Pig
>          Issue Type: Task
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.13.0
>
>         Attachments: PIG-3898-1-tez.patch, PIG-3898-1-trunk.patch
>
>
> Currently, PPNL assumes the MR plan, and thus it is not compatible with
> non-MR execution engines. To support non-MR execution engines, I propose we
> change the initialPlanNotification() method as follows:
> {code:title=from}
> public void initialPlanNotification(String scriptId, MROperPlan plan);
> {code}
> {code:title=to}
> public void initialPlanNotification(String scriptId, OperatorPlan<?> plan);
> {code}
> Since MROperPlan and TezOperPlan are both subclasses of OperatorPlan, this
> method can take either plan. In addition, if we add a new execution engine in
> the future, it won't break the interface again as long as we build the
> operator plan as a subclass of OperatorPlan.
> With this approach, applications such as Ambrose / Lipstick should be able to
> dynamically cast OperatorPlan to a concrete subclass depending on the
> ExecType.
> One disadvantage is that this isn't backward compatible with Pig 0.12 and
> older. But it only requires minor changes, and backward compatibility will be
> broken one time only.
> I also considered an alternative approach, for example, adding a new PPNL for
> Tez. But this approach has two problems.
> # Pig registers PPNL via the Main function, and right now, only one PPNL can
> be registered. So having more than one PPNL requires quite a few code
> changes in Main, ScriptState, and so on.
> # Multiple PPNL interfaces mean multiple PPNL implementations. This results
> in more (duplicate) code in applications such as Ambrose / Lipstick.
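
To illustrate how an application might consume the proposed signature, here is a minimal, self-contained sketch. It is not taken from the attached patches. The classes named OperatorPlan, MROperPlan, and TezOperPlan below are simplified stand-ins declared in the snippet itself, not Pig's real classes, and PlanListener, EngineAwareListener, and PlanListenerSketch are hypothetical names. The sketch also dispatches on the runtime type of the plan with instanceof, which is an assumption made for brevity; the description above suggests choosing the cast based on the ExecType.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Pig's OperatorPlan (assumption: the real class
// has a richer API and a bounded type parameter).
abstract class OperatorPlan<E> {
    private final List<E> operators = new ArrayList<>();

    void add(E operator) {
        operators.add(operator);
    }

    int size() {
        return operators.size();
    }
}

// Stand-ins for the engine-specific plans; operators are plain strings here.
class MROperPlan extends OperatorPlan<String> {}

class TezOperPlan extends OperatorPlan<String> {}

// Hypothetical listener interface carrying only the proposed method.
interface PlanListener {
    void initialPlanNotification(String scriptId, OperatorPlan<?> plan);
}

// A single listener implementation that serves both engines by casting
// the generic plan to the concrete subclass it recognizes.
class EngineAwareListener implements PlanListener {
    String lastEngine = "none";
    int lastPlanSize = -1;

    @Override
    public void initialPlanNotification(String scriptId, OperatorPlan<?> plan) {
        if (plan instanceof MROperPlan) {
            MROperPlan mrPlan = (MROperPlan) plan;
            lastEngine = "mapreduce";
            lastPlanSize = mrPlan.size();
        } else if (plan instanceof TezOperPlan) {
            TezOperPlan tezPlan = (TezOperPlan) plan;
            lastEngine = "tez";
            lastPlanSize = tezPlan.size();
        } else {
            // A future engine's plan still reaches the listener; it is
            // simply not inspected until the application adds support.
            lastEngine = "unknown";
            lastPlanSize = plan.size();
        }
    }
}

class PlanListenerSketch {
    public static void main(String[] args) {
        EngineAwareListener listener = new EngineAwareListener();

        MROperPlan mrPlan = new MROperPlan();
        mrPlan.add("job-1");
        listener.initialPlanNotification("script-1", mrPlan);
        System.out.println(listener.lastEngine + " plan with "
                + listener.lastPlanSize + " operator(s)");

        listener.initialPlanNotification("script-2", new TezOperPlan());
        System.out.println(listener.lastEngine + " plan with "
                + listener.lastPlanSize + " operator(s)");
    }
}
```

The point of the sketch is that one listener class handles both plan types, which is the duplication the single-interface approach is meant to avoid. An unrecognized plan type falls through to the last branch instead of breaking the interface.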