[ 
https://issues.apache.org/jira/browse/SPARK-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978605#comment-13978605
 ] 

Michael Armbrust edited comment on SPARK-1591 at 4/23/14 6:47 PM:
------------------------------------------------------------------

Hi Ken,

I'd be curious whether your UDTF works with Spark SQL, which is a from-scratch 
rewrite of Shark that will be included in Spark 1.0.  My guess is we are going 
to run into the same problem, but if we do, I'd like to fix it.

Michael


was (Author: marmbrus):
Hi Ken,

I'd be curious if your UDTF works with Spark SQL, which is a from scratch 
rewrite of Shark that will be included in Spark 1.0.  My guess is we are going 
to run into the same problem, but if do I'd like to fix it.

Michael

> scala.MatchError executing custom UDTF
> --------------------------------------
>
>                 Key: SPARK-1591
>                 URL: https://issues.apache.org/jira/browse/SPARK-1591
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 0.9.1
>         Environment: CentOS 5, Hortonworks 1.3.2, Hadoop 1.2.0, Hive 0.11.0, 
> Spark 0.9.1, Shark 0.9.1, sharkserver2, beeline
>            Reporter: Ken Ellinwood
>            Priority: Minor
>
> My custom UDTF fails to execute in Shark even though it runs fine in Hive.
>
> scala.MatchError: [orange, 1, Black, 419] (of class java.util.ArrayList)
>     at scala.runtime.ScalaRunTime$.array_clone(ScalaRunTime.scala:118)
>     at shark.execution.UDTFCollector.collect(UDTFOperator.scala:92)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:91)
>     at com.mycompany.warehouse.hive.HiveUdtfColorTreeTable.process(HiveUdtfColorTreeTable.java:98)
>     at shark.execution.UDTFOperator.explode(UDTFOperator.scala:79)
>     at shark.execution.LateralViewJoinOperator$$anonfun$processPartition$1.apply(LateralViewJoinOperator.scala:141)
>
> The code at UDTFOperator.scala, line 92, makes two assumptions that do not 
> hold in my case.  First, it claims that the row object needs to be cloned.  
> Second, it assumes all row objects are arrays.  In my case the row is an 
> ArrayList and does not need to be cloned, because my UDTF already creates a 
> new one for each row.  The clone operation fails because my row is not an 
> array.
>
> I changed my implementation to use an array, but we have a non-trivial number 
> of custom UDFs that all work with Hive, and I think they should work in Shark 
> without modification.
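
For illustration only, here is a minimal sketch of the kind of tolerant row
handling the report asks for: a collector-side helper that accepts either an
Object[] or a java.util.List and always hands back a fresh Object[].  The
class RowNormalizer and the method toRowArray are hypothetical names invented
for this example; this is not the code in UDTFOperator.scala and not a
proposed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Shark, Spark, or Hive.
final class RowNormalizer {
    private RowNormalizer() {}

    // Returns a new Object[] holding the fields of a forwarded row.
    // Handles the two row shapes mentioned in the report: arrays and lists.
    static Object[] toRowArray(Object row) {
        if (row instanceof Object[]) {
            // Defensive copy, in case the UDTF reuses one array across calls.
            return ((Object[]) row).clone();
        }
        if (row instanceof List<?>) {
            // toArray() allocates a new array, so no separate clone is needed.
            return ((List<?>) row).toArray();
        }
        throw new IllegalArgumentException(
            "Unsupported row type: "
                + (row == null ? "null" : row.getClass().getName()));
    }

    public static void main(String[] args) {
        // Same values as the row shown in the MatchError message above.
        List<Object> listRow =
            new ArrayList<Object>(Arrays.asList("orange", 1, "Black", 419));
        Object[] arrayRow = new Object[] {"orange", 1, "Black", 419};

        System.out.println(Arrays.toString(toRowArray(listRow)));
        System.out.println(Arrays.toString(toRowArray(arrayRow)));
    }
}
```

With a helper of this shape, a UDTF that forwards an ArrayList and one that
forwards an array would both reach the downstream operator as an Object[].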



--
This message was sent by Atlassian JIRA
(v6.2#6252)
