[ https://issues.apache.org/jira/browse/HIVE-7799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107462#comment-14107462 ]

Venki Korukanti commented on HIVE-7799:
---------------------------------------

ScriptOperator spawns a separate thread that adds records to the collector, so 
the iterator thread ends up reading from the RowContainer while the spawned 
thread is still adding records. Other operators may spawn processing threads 
as well. It looks like we need a synchronized queue with persistence support. 
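As a rough illustration of the in-memory half of such a queue, the sketch below uses a bounded java.util.concurrent.LinkedBlockingQueue to synchronize the producer thread and the result iterator; the persistence (spill-to-disk) part is deliberately omitted, and the class name SyncResultQueue is made up for this sketch rather than an existing Hive class.

{code:java}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch only: a synchronized, bounded hand-off between the
// producer thread (e.g. the one ScriptOperator spawns) and the result
// iterator. A real replacement would also spill to disk once the bound is
// reached; that persistence part is left out here.
public class SyncResultQueue<T> {
  private final BlockingQueue<T> queue;

  public SyncResultQueue(int capacity) {
    this.queue = new LinkedBlockingQueue<>(capacity);
  }

  /** Producer side: blocks when the queue is full instead of corrupting shared state. */
  public void add(T row) throws InterruptedException {
    queue.put(row);
  }

  /** Consumer side: blocks until the producer has made a row available. */
  public T next() throws InterruptedException {
    return queue.take();
  }
}
{code}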

> TRANSFORM failed in transform_ppr1.q[Spark Branch]
> --------------------------------------------------
>
>                 Key: HIVE-7799
>                 URL: https://issues.apache.org/jira/browse/HIVE-7799
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: Spark-M1
>         Attachments: HIVE-7799.1-spark.patch, HIVE-7799.2-spark.patch
>
>
> Here is the exception:
> {noformat}
> 2014-08-20 01:14:36,594 ERROR executor.Executor (Logging.scala:logError(96)) - Exception in task 0.0 in stage 1.0 (TID 0)
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.spark.HiveKVResultCache.next(HiveKVResultCache.java:113)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:124)
>         at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList$ResultIterator.next(HiveBaseFunctionResultList.java:82)
>         at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:42)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>         at org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:65)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:54)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:199)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:722)
> {noformat}
> Basically, the cause is that RowContainer is misused (it does not allow 
> writes once someone has read a row from it). I'm trying to figure out 
> whether this is a general Hive issue or specific to Hive on Spark mode.
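
To make the write-after-read contract from the quoted description concrete, here is a small hypothetical stand-in (not the real RowContainer API): it fails fast when a write arrives after reading has started, whereas the real class ends up with corrupted state and the NullPointerException shown in the stack trace above.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, invented for illustration; it does not mirror the
// real RowContainer API. It simply enforces "no writes once reading started".
class WriteThenReadContainer<T> {
  private final List<T> rows = new ArrayList<>();
  private boolean readStarted = false;
  private int cursor = 0;

  synchronized void addRow(T row) {
    if (readStarted) {
      // Fail fast to make the contract visible.
      throw new IllegalStateException("write after read is not supported");
    }
    rows.add(row);
  }

  synchronized T next() {
    readStarted = true;
    return cursor < rows.size() ? rows.get(cursor++) : null;
  }
}
{code}

With this stand-in, the interleaving from the stack trace (the iterator calling next() while the script thread keeps calling addRow()) would surface immediately as an IllegalStateException rather than an NPE deep inside HiveKVResultCache.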



--
This message was sent by Atlassian JIRA
(v6.2#6252)
