Hanbo Wang created SPARK-15418:
----------------------------------
Summary: SparkSQL does not support using a UDAF in a CREATE VIEW clause
Key: SPARK-15418
URL: https://issues.apache.org/jira/browse/SPARK-15418
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.6.1
Reporter: Hanbo Wang
I am using AWS EMR with Spark 1.6.1 and Hive 1.0.0.
I have the following UDAF, which I have added to the Spark classpath:
https://github.com/scribd/hive-udaf-maxrow/blob/master/src/com/scribd/hive/udaf/GenericUDAFMaxRow.java
I registered it in Spark with:
sqlContext.sql("CREATE TEMPORARY FUNCTION maxrow AS 'some.cool.package.hive.udf.GenericUDAFMaxRow'")
However, I then call it from Spark in the following CREATE VIEW query:
{{
CREATE VIEW VIEW_1 AS
SELECT
a.A,
a.B,
maxrow ( a.C,
a.D,
a.E,
a.F,
a.G,
a.H,
a.I
) as m
FROM
table_1 a
JOIN
table_2 b
ON
b.Z = a.D
AND b.Y = a.C
JOIN dummy_table
GROUP BY
a.A,
a.B
}}
The query fails with the following error:
{{
16/05/18 19:49:14 WARN RowResolver: Duplicate column info for a.A was overwritten in RowResolver map: _col0: string by _col0: string
16/05/18 19:49:14 WARN RowResolver: Duplicate column info for a.B was overwritten in RowResolver map: _col1: bigint by _col1: bigint
16/05/18 19:49:14 ERROR Driver: FAILED: SemanticException [Error 10002]: Line 16:32 Invalid column reference 'C'
org.apache.hadoop.hive.ql.parse.SemanticException: Line 16:32 Invalid column reference 'C'
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genAllExprNodeDesc(SemanticAnalyzer.java:10643)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:10591)
at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genSelectPlan(SemanticAnalyzer.java:3656)
}}
Running the same SELECT query without the CREATE VIEW clause works fine.
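To make the comparison between the two variants easier to rerun, the steps in this report can be sketched as a small Python helper. This is only an illustrative sketch: the names `reproduce`, `run_sql`, `REGISTER_UDAF`, `SELECT_QUERY` and `CREATE_VIEW_QUERY` are hypothetical and not part of Spark, the SQL is the query from this report with whitespace normalized, and `run_sql` is assumed to be a callable such as the `sql` method of a Hive-enabled `sqlContext`.

```python
# Hypothetical repro helper: the names here are illustrative, not Spark APIs.
# The class name and table names are copied from the report as-is.
REGISTER_UDAF = (
    "CREATE TEMPORARY FUNCTION maxrow "
    "AS 'some.cool.package.hive.udf.GenericUDAFMaxRow'"
)

SELECT_QUERY = """
SELECT
  a.A,
  a.B,
  maxrow(a.C, a.D, a.E, a.F, a.G, a.H, a.I) AS m
FROM table_1 a
JOIN table_2 b
  ON b.Z = a.D AND b.Y = a.C
JOIN dummy_table
GROUP BY a.A, a.B
""".strip()

# Same SELECT, wrapped in the CREATE VIEW statement from the report.
CREATE_VIEW_QUERY = "CREATE VIEW VIEW_1 AS\n" + SELECT_QUERY


def reproduce(run_sql):
    """Register the UDAF, run both variants, and record which ones raise."""
    run_sql(REGISTER_UDAF)
    outcomes = {}
    for label, query in (("select", SELECT_QUERY),
                         ("create_view", CREATE_VIEW_QUERY)):
        try:
            run_sql(query)
            outcomes[label] = "ok"
        except Exception as exc:  # keep the message so it can be compared
            outcomes[label] = "failed: %s" % exc
    return outcomes
```

Calling `reproduce(sqlContext.sql)` on the affected setup would, going by the behavior described above, be expected to report the plain SELECT as succeeding and the CREATE VIEW variant as failing. I have not verified this on other versions.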