[ 
https://issues.apache.org/jira/browse/SPARK-1591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13978605#comment-13978605
 ] 

Michael Armbrust commented on SPARK-1591:
-----------------------------------------

Hi Ken,

I'd be curious whether your UDTF works with Spark SQL, which is a from-scratch 
rewrite of Shark that will be included in Spark 1.0.  My guess is that we are 
going to run into the same problem, but if we do, I'd like to fix it.

Michael

> scala.MatchError executing custom UDTF
> --------------------------------------
>
>                 Key: SPARK-1591
>                 URL: https://issues.apache.org/jira/browse/SPARK-1591
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 0.9.1
>         Environment: CentOS 5, Hortonworks 1.3.2, Hadoop 1.2.0, Hive 0.11.0, 
> Spark 0.9.1, Shark 0.9.1, sharkserver2, beeline
>            Reporter: Ken Ellinwood
>            Priority: Minor
>
> My custom UDTF fails to execute in Shark even though it runs fine in Hive.
> scala.MatchError: [orange, 1, Black, 419] (of class java.util.ArrayList)
>     at scala.runtime.ScalaRunTime$.array_clone(ScalaRunTime.scala:118)
>     at shark.execution.UDTFCollector.collect(UDTFOperator.scala:92)
>     at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:91)
>     at com.mycompany.warehouse.hive.HiveUdtfColorTreeTable.process(HiveUdtfColorTreeTable.java:98)
>     at shark.execution.UDTFOperator.explode(UDTFOperator.scala:79)
>     at shark.execution.LateralViewJoinOperator$$anonfun$processPartition$1.apply(LateralViewJoinOperator.scala:141)
> The code at UDTFOperator.scala, line 92 makes two assumptions that are not 
> true in my case.  First, it claims to need to clone the row object.  
> Second, it assumes all row objects are arrays.  In my case the row is 
> an ArrayList and does not need to be cloned, because my UDTF already 
> creates a new one for each row.  The clone operation fails because 
> my row is not an array.
> I changed my implementation to use an array, but we have a non-trivial number 
> of custom UDFs that all work with Hive and I think they should work in Shark 
> without modification.
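
The failure mode described above can be illustrated with a minimal,
self-contained Java sketch.  It is not Shark's actual code: RowCopySketch and
copyRow are hypothetical names, and the only assumption taken from the report
is that a UDTF may forward a row either as an Object[] or as a java.util.List.
The sketch copies either representation instead of assuming an array.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Shark or Hive: defensively copies a row
// forwarded by a UDTF, whichever of the two representations it uses.
final class RowCopySketch {

    static Object[] copyRow(Object row) {
        if (row instanceof Object[]) {
            // Array-backed row: clone it so the collector owns its own copy.
            return ((Object[]) row).clone();
        }
        if (row instanceof List<?>) {
            // List-backed row (for example an ArrayList): toArray() returns a
            // freshly allocated array, so no separate clone step is needed.
            return ((List<?>) row).toArray();
        }
        // Any other representation is rejected with a descriptive message
        // rather than an opaque match failure.
        throw new IllegalArgumentException(
            "Unsupported row type: "
                + (row == null ? "null" : row.getClass().getName()));
    }

    public static void main(String[] args) {
        // Same values as the row in the reported stack trace.
        List<Object> listRow = Arrays.<Object>asList("orange", 1, "Black", 419);
        Object[] arrayRow = {"orange", 1, "Black", 419};

        System.out.println(Arrays.toString(copyRow(listRow)));
        System.out.println(Arrays.toString(copyRow(arrayRow)));
    }
}
```

Handling both cases in the collector, rather than in each UDTF, matches the
reporter's point that existing UDFs should not need to be modified.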



--
This message was sent by Atlassian JIRA
(v6.2#6252)
