[ 
https://issues.apache.org/jira/browse/PIG-4374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292810#comment-14292810
 ] 

liyunzhang_intel commented on PIG-4374:
---------------------------------------

[~mohitsabharwal],
The following code is 
org.apache.pig.tools.pigstats.mapreduce.SimplePigStats.JobGraphBuilder, which is 
similar to your SparkPigStats. I think you need SparkPlan in SparkPigStats.
{code}
private class JobGraphBuilder extends MROpPlanVisitor {

    // Walk the MR plan in dependency order, so every predecessor
    // is visited before the operators that depend on it.
    public JobGraphBuilder(MROperPlan plan) {
        super(plan, new DependencyOrderWalker<MapReduceOper, MROperPlan>(
                plan));
        jobPlan = new JobGraph();
        mroJobMap = new HashMap<MapReduceOper, MRJobStats>();
    }

    @Override
    public void visitMROp(MapReduceOper mr) throws VisitorException {
        // Create one job-stats node per MapReduce operator.
        MRJobStats js = new MRJobStats(
                mr.getOperatorKey().toString(), jobPlan);
        jobPlan.add(js);
        // Connect the new node to the stats nodes of its predecessors.
        List<MapReduceOper> preds = getPlan().getPredecessors(mr);
        if (preds != null) {
            for (MapReduceOper pred : preds) {
                MRJobStats jpred = mroJobMap.get(pred);
                if (!jobPlan.isConnected(jpred, js)) {
                    jobPlan.connect(jpred, js);
                }
            }
        }
        // Remember the mapping so successors can find this node.
        mroJobMap.put(mr, js);
    }
}
{code}
We have other reasons for needing SparkPlan besides passing the unit tests:  1. 
SparkLauncher currently converts a physical plan to an MRPlan to extract some 
info into physicalOper. If this conversion is removed, more unit tests fail 
(see PIG-4364), so my patch converts the physical plan to a SparkPlan instead.  
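
As a rough illustration only of what the Spark counterpart of JobGraphBuilder 
above might do, here is a self-contained sketch. SparkJobGraphSketch, Op and 
buildJobGraph are hypothetical stand-ins invented for this example, not Pig 
classes. A real version would presumably extend the proposed 
SparkOpPlanVisitor and walk a SparkOperPlan in dependency order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names exist in Pig.
final class SparkJobGraphSketch {

    /** Stand-in for a Spark operator: a key plus its predecessor operators. */
    static final class Op {
        final String key;
        final List<Op> preds = new ArrayList<Op>();

        Op(String key, Op... preds) {
            this.key = key;
            this.preds.addAll(Arrays.asList(preds));
        }
    }

    /**
     * Builds a job graph (job name -> names of predecessor jobs).
     * The operators must be supplied in dependency order, i.e. every
     * predecessor appears before the operators that depend on it.
     */
    static Map<String, List<String>> buildJobGraph(List<Op> opsInDependencyOrder) {
        Map<Op, String> opToJob = new HashMap<Op, String>();
        Map<String, List<String>> jobGraph = new LinkedHashMap<String, List<String>>();
        for (Op op : opsInDependencyOrder) {
            // One job node per operator, as visitMROp does for MapReduceOper.
            String job = "job_" + op.key;
            List<String> predJobs = new ArrayList<String>();
            for (Op pred : op.preds) {
                String predJob = opToJob.get(pred);
                if (predJob == null) {
                    throw new IllegalStateException(
                            "predecessor not visited yet: " + pred.key);
                }
                // Connect each predecessor job only once.
                if (!predJobs.contains(predJob)) {
                    predJobs.add(predJob);
                }
            }
            jobGraph.put(job, predJobs);
            // Remember the mapping so successors can find this job.
            opToJob.put(op, job);
        }
        return jobGraph;
    }

    public static void main(String[] args) {
        Op load = new Op("load");
        Op filter = new Op("filter", load);
        Op store = new Op("store", filter);
        // prints {job_load=[], job_filter=[job_load], job_store=[job_filter]}
        System.out.println(buildJobGraph(Arrays.asList(load, filter, store)));
    }
}
```

The point of the sketch is that the stats graph can only be built if there is 
an operator plan to walk, which is why SparkPigStats needs a SparkPlan.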

> Add SparkPlan in spark package
> ------------------------------
>
>                 Key: PIG-4374
>                 URL: https://issues.apache.org/jira/browse/PIG-4374
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>         Attachments: PIG-4374_1.patch, Pig-spark #76 [Jenkins .png, 
> jenkins_PIG-4374_1_patch.png.png
>
>
> In the current code, the following classes exist in the mapreduce and tez packages.
> mapreduce:
> MRCompiler
> MROperPlan  MROpPlanVisitor
> MapReduceOper
> tez:
> TezCompiler
> TezOperPlan  TezOpPlanVisitor
> TezOperator
> The following classes need to be added to the spark package:
> SparkCompiler
> SparkOperPlan  SparkOpPlanVisitor
> SparkOperator
> The current code needs to be refactored because of the classes added above. Some 
> unit tests, such as TestStoreInstances, fail because of it.
> The following is the error from the unit test TestStoreInstances:
> Error Message
> num jobs expected:<1> but was:<0>
> Stacktrace
> junit.framework.AssertionFailedError: num jobs expected:<1> but was:<0>
>       at org.apache.pig.test.TestStoreInstances.testBackendStoreCommunication(TestStoreInstances.java:122)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
