[
https://issues.apache.org/jira/browse/PIG-4438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
liyunzhang_intel updated PIG-4438:
----------------------------------
Attachment: PIG-4438_1.patch
PIG-4438_1.patch is an initial patch. It still hits a problem when running the
script from the bug description; more time is needed to figure it out. The error is:
{code}
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost): java.lang.ClassCastException: java.lang.Byte cannot be cast to java.util.Iterator
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.PackageConverter$PackageFunction.apply(PackageConverter.java:85)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.PackageConverter$PackageFunction.apply(PackageConverter.java:48)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.convert.Wrappers$IteratorWrapper.next(Wrappers.scala:30)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.readNext(POOutputConsumerIterator.java:35)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.hasNext(POOutputConsumerIterator.java:64)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.readNext(POOutputConsumerIterator.java:30)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.hasNext(POOutputConsumerIterator.java:64)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.readNext(POOutputConsumerIterator.java:30)
    at org.apache.pig.backend.hadoop.executionengine.spark.converter.POOutputConsumerIterator.hasNext(POOutputConsumerIterator.java:64)
    at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:41)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:987)
    at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:965)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
{code}
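The ClassCastException itself is a plain unchecked cast: judging from the trace, PackageConverter.java:85 casts a tuple slot to java.util.Iterator, but in this run the slot apparently holds a raw value (a Byte) rather than the iterator of grouped values. A minimal stand-alone Java sketch of that failure mode (variable names and the cast are illustrative, not Pig's actual code):

```java
import java.util.Iterator;

public class CastFailureDemo {
    public static void main(String[] args) {
        // A tuple slot is typed as Object; here it holds a Byte,
        // as in the failing run (illustrative, not Pig's real data flow).
        Object slot = Byte.valueOf((byte) 1);

        try {
            // The converter assumes the slot holds the iterator of grouped
            // values and casts unconditionally -- this throws at runtime.
            Iterator<?> values = (Iterator<?>) slot;
            values.hasNext();
        } catch (ClassCastException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Presumably the fix is to make PackageConverter and the converters feeding it (sort/limit) agree on whether a packaged iterator or a raw value flows between them.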
> Can not work when in "limit after sort" situation in spark mode
> ---------------------------------------------------------------
>
> Key: PIG-4438
> URL: https://issues.apache.org/jira/browse/PIG-4438
> Project: Pig
> Issue Type: Sub-task
> Components: spark
> Reporter: liyunzhang_intel
> Assignee: liyunzhang_intel
> Fix For: spark-branch
>
> Attachments: PIG-4438_1.patch
>
>
> When a Pig script executes an "order" before a "limit" in spark mode, the
> results are wrong.
> cat testlimit.txt
> 1 orange
> 3 coconut
> 5 grape
> 6 pear
> 2 apple
> 4 mango
> testlimit.pig:
> a = load './testlimit.txt' as (x:int, y:chararray);
> b = order a by x;
> c = limit b 1;
> store c into './testlimit.out';
> the actual result:
> 1 orange
> 2 apple
> 3 coconut
> 4 mango
> 5 grape
> 6 pear
> the correct result should be:
> 1 orange
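For reference, the intended semantics of the script are simply sort-then-truncate. A plain-Java sketch over the sample data (using java.util.stream rather than Pig or Spark APIs, purely to illustrate the expected behavior) keeps only the single smallest row:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LimitAfterSortDemo {
    public static void main(String[] args) {
        // The sample relation from testlimit.txt: (x:int, y:chararray).
        List<Map.Entry<Integer, String>> a = List.of(
                new SimpleEntry<>(1, "orange"),
                new SimpleEntry<>(3, "coconut"),
                new SimpleEntry<>(5, "grape"),
                new SimpleEntry<>(6, "pear"),
                new SimpleEntry<>(2, "apple"),
                new SimpleEntry<>(4, "mango"));

        // b = order a by x;  c = limit b 1;
        List<Map.Entry<Integer, String>> c = a.stream()
                .sorted(Map.Entry.comparingByKey())  // order by x
                .limit(1)                            // limit 1
                .collect(Collectors.toList());

        System.out.println(c);  // [1=orange]
    }
}
```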
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)