[ 
https://issues.apache.org/jira/browse/PIG-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274793#comment-14274793
 ] 

liyunzhang_intel commented on PIG-4362:
---------------------------------------

I modified some code in SparkLauncher.java, around line 121:
{code}
@@ -121,7 +129,17 @@ public class SparkLauncher extends Launcher {
         }
 
         startSparkIfNeeded();
+        String currentDirectoryPath = Paths.get(".").toAbsolutePath().normalize().toString() + "/";
+        startSparkJob(pigContext,currentDirectoryPath);
+        LinkedList<POStore> stores = PlanHelper.getPhysicalOperators(
+                physicalPlan, POStore.class);
+        POStore firstStore = stores.getFirst();
+        if( firstStore != null ){
+            MapRedUtil.setupStreamingDirsConfSingle(firstStore, pigContext, c);
+        }
{code}
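
A side note on the null check in the snippet above: java.util.LinkedList#getFirst 
throws NoSuchElementException on an empty list instead of returning null, so 
the check only helps if the list can contain a null element. The sketch below 
shows the difference using peekFirst(), which does return null for an empty 
list. The class name FirstStoreGuard and the helper firstOrNull are 
hypothetical and not part of Pig; plain strings stand in for POStore.

```java
import java.util.LinkedList;
import java.util.NoSuchElementException;

// Hypothetical illustration, not Pig code: strings stand in for POStore.
class FirstStoreGuard {
    // Returns the first element, or null when the list is empty.
    static <T> T firstOrNull(LinkedList<T> items) {
        return items.peekFirst();
    }

    public static void main(String[] args) {
        LinkedList<String> stores = new LinkedList<>();
        try {
            stores.getFirst(); // throws on an empty list
        } catch (NoSuchElementException e) {
            System.out.println("getFirst() threw on an empty list");
        }
        // peekFirst() gives null here, so a null check is meaningful.
        System.out.println(firstOrNull(stores));
        stores.add("store-1");
        System.out.println(firstOrNull(stores));
    }
}
```

If a plan without any POStore is possible at this point in SparkLauncher, a 
guard of this kind (or an isEmpty() check) may be safer than getFirst().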

This change is needed because 
MapRedUtil.setupStreamingDirsConfSingle(firstStore, pigContext, c) initializes 
org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager#scriptOutputDir; 
without that call, TestStreaming#testOutputShipSpecs fails. 
TestStreaming#testOutputShipSpecs runs a script like the following:
{code}
define A `stream.pl` output('output1', 'output2' using MyDeserializer);
Y = stream X through A;
{code}
In this example, output1 is loaded into Y, while the second output is stored 
in a subdirectory named output2 under the output directory specified for the 
Pig script. If 
org.apache.pig.backend.hadoop.streaming.HadoopExecutableManager#scriptOutputDir 
is not initialized, the second output is not generated.
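
The relationship between the script output directory and the secondary output 
can be sketched roughly as follows. This is only an illustration under stated 
assumptions, not HadoopExecutableManager's actual logic: the class 
SecondaryOutputSketch and the method resolve are hypothetical, local 
java.nio paths stand in for Hadoop paths, and /tmp/pig-out is a made-up 
directory.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical illustration, not Pig code.
class SecondaryOutputSketch {
    // Where a secondary streaming output would land: a subdirectory of the
    // script output directory. An unset (null) directory yields no location.
    static Optional<Path> resolve(String scriptOutputDir, String outputName) {
        if (scriptOutputDir == null) {
            return Optional.empty();
        }
        return Optional.of(Paths.get(scriptOutputDir).resolve(outputName));
    }

    public static void main(String[] args) {
        // Initialized directory: output2 becomes a subdirectory of it.
        System.out.println(resolve("/tmp/pig-out", "output2"));
        // Uninitialized directory: there is nowhere to store output2.
        System.out.println(resolve(null, "output2"));
    }
}
```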


> Make ship work with spark
> -------------------------
>
>                 Key: PIG-4362
>                 URL: https://issues.apache.org/jira/browse/PIG-4362
>             Project: Pig
>          Issue Type: Bug
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>         Attachments: PIG-4362.patch, PIG-4362_1.patch, PIG-4362_2.patch, 
> PIG-4362_3.patch, PIG-4362_4.patch, test_harnesss_1420652019
>
>
> Related e2e tests: ComputeSpec_1~ComputeSpec_8.
> The following error occurs when running ComputeSpec_4.pig:
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> ===== Task Information Header =====
> Command: perl PigStreaming.pl (stdin-org.apache.pig.builtin.PigStreaming/stdout-org.apache.pig.builtin.PigStreaming)
> Start time: Mon Dec 29 10:48:43 CST 2014
> =====          * * *          =====
> Can't open perl script "PigStreaming.pl": No such file or directory
> Details at logfile: /home/zly/prj/oss/pig/test/e2e/pig/testdist/out/pigtest/root/root-1419821303-streaming.conf/ComputeSpec_4.log
> ERROR TestDriver::runTestGroup at : 729 Failed to run test ComputeSpec_4 <Failed running ./out/pigtest/root/root-1419821303-streaming.conf/ComputeSpec_4.pig



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
