[ https://issues.apache.org/jira/browse/PIG-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145775#comment-14145775 ]
liyunzhang_intel commented on PIG-4168:
---------------------------------------

Hi [~rohini], many thanks for your comments! I have updated the patch (PIG-4168_1.patch) and addressed the points you raised:

1. *Previous*: New ExecTypes are pluggable using ServiceLoader. Please do not add them to the ExecType class.
*In this patch*:
{code}
TestSpark#setUp
public void setUp() throws Exception {
    pigServer = new PigServer(new SparkExecType(), cluster.getProperties());
}
{code}

2. *Previous*: {code:xml}<copy file="${basedir}/test/core-site.xml" tofile="${test.build.classes}/core-site.xml"/>{code} Why do you have to create an empty core-site.xml and copy it to the build dir?
*In this patch*: I created the class SparkMiniCluster, which now generates build/classes/hadoop-site.xml in code instead of generating core-site.xml via build.xml. This file is needed because of the check in HExecutionEngine#getExecConf.
{code}
SparkMiniCluster#setupMiniDfsAndMrClusters
private static final File CONF_DIR = new File("build/classes");
private static final File CONF_FILE = new File(CONF_DIR, "hadoop-site.xml");

@Override
protected void setupMiniDfsAndMrClusters() {
    try {
        CONF_DIR.mkdirs();
        if (CONF_FILE.exists()) {
            CONF_FILE.delete();
        }
        m_conf = new Configuration();
        m_conf.set("io.sort.mb", "1");
        m_conf.writeXml(new FileOutputStream(CONF_FILE));
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}
{code}

{code}
HExecutionEngine#getExecConf
public JobConf getExecConf(Properties properties) throws ExecException {
    JobConf jc = null;
    // Check existence of user provided configs
    String isHadoopConfigsOverriden = properties.getProperty("pig.use.overriden.hadoop.configs");
    if (isHadoopConfigsOverriden != null && isHadoopConfigsOverriden.equals("true")) {
        jc = new JobConf(ConfigurationUtil.toConfiguration(properties));
    } else {
        // Check existence of hadoop-site.xml or core-site.xml in
        // classpath if user provided confs are not being used
        Configuration testConf = new Configuration();
        ClassLoader cl =
        testConf.getClassLoader();
        URL hadoop_site = cl.getResource(HADOOP_SITE);
        URL core_site = cl.getResource(CORE_SITE);
        if (hadoop_site == null && core_site == null) {
            throw new ExecException(
                    "Cannot find hadoop configurations in classpath "
                            + "(neither hadoop-site.xml nor core-site.xml was found in the classpath)."
                            + " If you plan to use local mode, please put -x local option in command line",
                    4010);
        }
        jc = new JobConf();
    }
    jc.addResource("pig-cluster-hadoop-site.xml");
    jc.addResource(YARN_SITE);
    return jc;
}
{code}

> Initial implementation of unit tests for Pig on Spark
> -----------------------------------------------------
>
>                 Key: PIG-4168
>                 URL: https://issues.apache.org/jira/browse/PIG-4168
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Praveen Rachabattuni
>            Assignee: liyunzhang_intel
>        Attachments: PIG-4168.patch
>
>
> 1. ant clean jar; pig-0.14.0-SNAPSHOT-core-h1.jar will be generated by this command
> 2. export SPARK_PIG_JAR=$PIG_HOME/pig-0.14.0-SNAPSHOT-core-h1.jar
> 3. Build the hadoop1 and spark env; spark runs in local mode.
> jps:
> 11647 Master  # spark master runs
> 6457 DataNode  # hadoop datanode runs
> 22374 Jps
> 11705 Worker  # spark worker runs
> 27009 JobTracker  # hadoop jobtracker runs
> 26602 NameNode  # hadoop namenode runs
> 29486 org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar
> 19692 Main
>
> 4. ant test-spark

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
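For readers unfamiliar with the ServiceLoader mechanism mentioned in point 1, the sketch below shows the general pattern. The `ExecTypeLike` interface and the stub classes are hypothetical stand-ins for illustration only; in Pig the real providers are registered through a `META-INF/services` file on the classpath, which a single-file sketch cannot ship, so a hardcoded list stands in for the discovered providers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.ServiceLoader;

public class ExecTypeLookupSketch {
    // Hypothetical stand-in for a pluggable exec-type interface.
    interface ExecTypeLike {
        boolean accepts(String name);
    }

    static class LocalExecTypeStub implements ExecTypeLike {
        public boolean accepts(String name) { return "local".equalsIgnoreCase(name); }
    }

    static class SparkExecTypeStub implements ExecTypeLike {
        public boolean accepts(String name) { return "spark".equalsIgnoreCase(name); }
    }

    // Selection loop: iterate the discovered providers and return the
    // first one that accepts the requested name.
    static ExecTypeLike select(Iterable<ExecTypeLike> providers, String name) {
        for (ExecTypeLike t : providers) {
            if (t.accepts(name)) {
                return t;
            }
        }
        throw new IllegalArgumentException("Unknown exec type: " + name);
    }

    public static void main(String[] args) {
        // With ServiceLoader, the provider list would come from the classpath
        // via ServiceLoader.load(ExecTypeLike.class), backed by a
        // META-INF/services registration file. No such file exists in this
        // single-file sketch, so discovery yields nothing here:
        System.out.println(ServiceLoader.load(ExecTypeLike.class).iterator().hasNext()); // prints "false"

        // A hardcoded list stands in for the discovered providers.
        List<ExecTypeLike> providers =
                Arrays.asList(new LocalExecTypeStub(), new SparkExecTypeStub());
        System.out.println(select(providers, "spark").getClass().getSimpleName()); // prints "SparkExecTypeStub"
    }
}
```

The point of the pattern is that adding a new engine only requires shipping a jar with a provider registration, not editing a central ExecType class.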