[
https://issues.apache.org/jira/browse/SPARK-28710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16906180#comment-16906180
]
Sandeep Katta commented on SPARK-28710:
---------------------------------------
cc [~dongjoon] [~hyukjin.kwon], your suggestions are needed on this.
As the logs show, after the replace command the UDF class name is updated
but the resource file is not. So Spark looks for the new UDF class in the
wrong path (the old jar).
*Observation 1*: Temporary functions do not have this problem, as they are
handled by Spark's own logic.
*Observation 2*: For a permanent function, Spark calls Hive to alter the
function. As of now Hive alters only the name, owner, class name and type,
but not the resource URI.
[Spark code|[https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala#L86]]
[Hive code|[https://github.com/apache/hive/blob/master/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/ObjectStore.java#L9914]]
As per the [Hive Documentation|[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/ReloadFunction]],
Hive does not support a create or replace command for functions.
So what should Spark do?
Solution 1: throw an unsupported-operation error for permanent functions
Solution 2: instead of altering the function, drop it and create it again
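
To make the difference between the current alter path and solution 2 concrete, here is a minimal Python sketch. It is only a toy model: ToyMetastore, FunctionEntry and both replace_via_* helpers are hypothetical names invented for illustration, not Spark or Hive APIs. The alter method is assumed to behave as described in Observation 2, i.e. it updates the class name and leaves the stored resource URIs alone.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class FunctionEntry:
    """Toy stand-in for a permanent function record (hypothetical)."""
    name: str
    class_name: str
    resource_uris: Tuple[str, ...]


class ToyMetastore:
    """Hypothetical in-memory catalog; NOT the Hive metastore API."""

    def __init__(self) -> None:
        self._functions: Dict[str, FunctionEntry] = {}

    def create_function(self, entry: FunctionEntry) -> None:
        if entry.name in self._functions:
            raise ValueError(f"function {entry.name} already exists")
        self._functions[entry.name] = entry

    def drop_function(self, name: str) -> None:
        del self._functions[name]

    def alter_function(self, name: str, new: FunctionEntry) -> None:
        # Assumption taken from Observation 2: the class name is updated,
        # but the stored resource URIs are left untouched.
        old = self._functions[name]
        self._functions[name] = replace(old, class_name=new.class_name)

    def get_function(self, name: str) -> FunctionEntry:
        return self._functions[name]


def replace_via_alter(store: ToyMetastore, entry: FunctionEntry) -> None:
    """Models the current behaviour: the stale jar stays registered."""
    store.alter_function(entry.name, entry)


def replace_via_drop_and_create(store: ToyMetastore, entry: FunctionEntry) -> None:
    """Models solution 2: the whole record, resources included, is replaced."""
    store.drop_function(entry.name)
    store.create_function(entry)


if __name__ == "__main__":
    old = FunctionEntry(
        "addDoubles",
        "com.huawei.bigdata.hive.example.udf.AddDoublesUDF",
        ("hdfs://hacluster/user/AddDoublesUDF.jar",),
    )
    new = FunctionEntry(
        "addDoubles",
        "com.huawei.bigdata.hive.example.udf.multiply",
        ("hdfs://hacluster/user/Multiply.jar",),
    )

    altered = ToyMetastore()
    altered.create_function(old)
    replace_via_alter(altered, new)
    print(altered.get_function("addDoubles"))  # new class name, old jar

    recreated = ToyMetastore()
    recreated.create_function(old)
    replace_via_drop_and_create(recreated, new)
    print(recreated.get_function("addDoubles"))  # new class name, new jar
```

In this model the alter path ends up with the new class name paired with the old jar, which matches the failure in the reported log, while drop and create replaces the resource list as well. One caveat with solution 2: in a real metastore the drop and the create would be two separate calls, so a failure in between could leave the function missing unless that case is handled.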
> [UDF] create or replace permanent function does not clear the jar in class
> path
> -------------------------------------------------------------------------------
>
> Key: SPARK-28710
> URL: https://issues.apache.org/jira/browse/SPARK-28710
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: ABHISHEK KUMAR GUPTA
> Priority: Major
>
> 0: jdbc:hive2://10.18.19.208:23040/default> create function addDoubles AS
> 'com.huawei.bigdata.hive.example.udf.AddDoublesUDF' using jar
> 'hdfs://hacluster/user/AddDoublesUDF.jar';
> +---------+
> | Result |
> +---------+
> +---------+
> No rows selected (0.216 seconds)
> 0: jdbc:hive2://10.18.19.208:23040/default> create or replace function
> addDoubles AS 'com.huawei.bigdata.hive.example.udf.multiply' using jar
> 'hdfs://hacluster/user/Multiply.jar';
> +---------+
> | Result |
> +---------+
> +---------+
> No rows selected (0.292 seconds)
> 0: jdbc:hive2://10.18.19.208:23040/default> select addDoubles(3,3);
> INFO : Added
> [/tmp/8f3d7e87-469e-45e9-b5d1-7c714c5e0183_resources/AddDoublesUDF.jar] to
> class path
> INFO : Added resources: [hdfs://hacluster/user/AddDoublesUDF.jar]
> INFO : Added
> [/tmp/8f3d7e87-469e-45e9-b5d1-7c714c5e0183_resources/AddDoublesUDF.jar] to
> class path
> INFO : Added resources: [hdfs://hacluster/user/AddDoublesUDF.jar]
> Error: org.apache.spark.sql.AnalysisException: Can not load class
> 'com.huawei.bigdata.hive.example.udf.multiply' when registering the function
> 'default.addDoubles', please make sure it is on the classpath; line 1 pos 7
> (state=,code=0)