[ 
https://issues.apache.org/jira/browse/PIG-161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shravan Matthur Narayanamurthy updated PIG-161:
-----------------------------------------------

    Attachment: incr5.patch

This patch mainly contains the MRCompiler and its supporting classes in the 
mapReduceLayer package, along with tests for them. For better testability, I 
have included some dummy operators such as POGlobalRearrange and POCast, since 
the compiler does not care about an operator's functionality. Also included is 
the POSplit MapReduce operator. It is essentially a dummy, as it does not do 
any work. The compiler translates it into a store-load pair and assumes that 
the logical-to-physical translation will ensure that the relevant filters are 
attached as the outputs of the Split.
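
To illustrate the store-load idea, here is a minimal sketch. It is not the 
MRCompiler code: the class SplitRewriteSketch, the method rewriteSplit, the 
temporary path, and the one-load-per-consumer layout are all assumptions made 
up for illustration, and plan steps are plain strings rather than operators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitRewriteSketch {
    /**
     * Hypothetical rewrite of a split: emit one store to a temporary
     * location, then one load of that location per consumer.
     * The names and the string representation are illustrative only.
     */
    public static List<String> rewriteSplit(String tmpPath, List<String> consumers) {
        List<String> steps = new ArrayList<>();
        // The split itself does no work; its input is simply materialized.
        steps.add("STORE -> " + tmpPath);
        // Each consumer (for example a filter) reads the stored data back.
        for (String consumer : consumers) {
            steps.add("LOAD " + tmpPath + " -> " + consumer);
        }
        return steps;
    }

    public static void main(String[] args) {
        List<String> steps = rewriteSplit("tmp/split-0", Arrays.asList("FilterA", "FilterB"));
        for (String step : steps) {
            System.out.println(step);
        }
    }
}
```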

Also in the patch is an implementation of the POUnion operator, which works for 
both the MapReduce and Local backends. I also have tests for it.

Another class included is a PlanPrinter, which does tree-like pretty printing of 
the plan. I am attaching another file containing the test cases I have run for 
the MRCompiler, about 14 in all. For each case it has the PlanPrinter 
representation of the plan that was compiled and of the compiled plan. Please 
check whether the conversion taking place is correct.
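
For readers who have not seen the attachment, the following is a rough sketch 
of what tree-like pretty printing of a plan can look like. It is not the 
PlanPrinter from the patch: the Op class, the operator names, and the exact 
output format are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class PlanPrinterSketch {
    /** Hypothetical stand-in for a physical operator; not a Pig class. */
    public static final class Op {
        final String name;
        final List<Op> inputs = new ArrayList<>();

        public Op(String name, Op... inputs) {
            this.name = name;
            for (Op in : inputs) {
                this.inputs.add(in);
            }
        }
    }

    /** Renders the operator and, indented beneath it, the operators feeding it. */
    public static String print(Op root) {
        StringBuilder sb = new StringBuilder();
        sb.append(root.name).append('\n');
        printInputs(root, "", sb);
        return sb.toString();
    }

    private static void printInputs(Op op, String prefix, StringBuilder sb) {
        for (int i = 0; i < op.inputs.size(); i++) {
            boolean last = (i == op.inputs.size() - 1);
            Op in = op.inputs.get(i);
            sb.append(prefix).append("|---").append(in.name).append('\n');
            // Keep a vertical bar in the prefix while siblings remain below.
            printInputs(in, prefix + (last ? "    " : "|   "), sb);
        }
    }

    public static void main(String[] args) {
        Op plan = new Op("Store",
                new Op("Union", new Op("Load-A"), new Op("Load-B")));
        System.out.print(print(plan));
    }
}
```

Printing from the root downwards keeps the sketch short; a real plan is a 
graph and may need extra handling for operators that have several successors.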

The MRCompiler doesn't support the POSort operator yet. After much thought I 
decided to submit the patch without it, because the MapReduce POSort needs 
POUserFunc and the local POSort. So I decided to wait for them to be checked in.

Adding it later would not be a major change and would not affect existing code.

Another thing is that the MRCompiler uses the GenPhyOp class, because of which 
I have included some test folder classes in the compilation of the src folder 
classes. As an artifact of the changes in GenPhyOp, which now calls 
PigContext.connect() and hence needs the MiniCluster, all tests that use it 
take much longer to execute. The test time has shot up to 1 min 46 sec. Is 
there a way to create the MiniCluster just once rather than in each TestCase?
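
One possible direction, sketched under assumptions: hold the cluster in a 
lazily initialized static holder so that every test asks the holder for it 
instead of creating its own. FakeCluster below is a hypothetical stand-in and 
not the MiniCluster API; the sketch only shows that the expensive constructor 
runs once per JVM.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SharedClusterSketch {
    /** Hypothetical stand-in for an expensive test cluster; not Pig's MiniCluster. */
    public static final class FakeCluster {
        /** Counts how many times a cluster has been started. */
        public static final AtomicInteger STARTS = new AtomicInteger();

        private FakeCluster() {
            STARTS.incrementAndGet(); // stands in for the slow startup work
        }
    }

    /** Initialization-on-demand holder: the JVM creates INSTANCE once, on first use. */
    private static final class Holder {
        static final FakeCluster INSTANCE = new FakeCluster();
    }

    /** Every test calls this instead of constructing its own cluster. */
    public static FakeCluster cluster() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        FakeCluster a = cluster();
        FakeCluster b = cluster();
        System.out.println(a == b);                   // prints "true"
        System.out.println(FakeCluster.STARTS.get()); // prints "1"
    }
}
```

This only helps if the test classes run in the same JVM, and a real cluster 
would also need to be shut down once at the end of the run.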

This is a pretty long patch and an important one too, so please review it 
thoroughly. Awaiting comments.

> Rework physical plan
> --------------------
>
>                 Key: PIG-161
>                 URL: https://issues.apache.org/jira/browse/PIG-161
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: arithmeticOperators.patch, incr2.patch, incr3.patch, 
> incr4.patch, incr5.patch, Phy_AbsClass.patch, pogenerate.patch, 
> pogenerate.patch, pogenerate.patch, posort.patch
>
>
> This bug tracks work to rework all of the physical operators as described in 
> http://wiki.apache.org/pig/PigTypesFunctionalSpec

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
