[ https://issues.apache.org/jira/browse/SPARK-24935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16769692#comment-16769692 ]
Parth Gandhi commented on SPARK-24935:
--------------------------------------
I have created a new pull request for this issue:
https://github.com/apache/spark/pull/23778
> Problem with Executing Hive UDF's from Spark 2.2 Onwards
> --------------------------------------------------------
>
> Key: SPARK-24935
> URL: https://issues.apache.org/jira/browse/SPARK-24935
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0, 2.3.1
> Reporter: Parth Gandhi
> Priority: Major
>
> A user of the sketches library (https://github.com/DataSketches/sketches-hive)
> reported an issue with the HLL Sketch Hive UDAF that seems to be a bug in Spark
> or Hive. Their code runs fine in Spark 2.1 but has an issue from 2.2 onwards.
> For more details, refer to the discussion on the sketches-user list:
> https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/sketches-user/GmH4-OlHP9g/MW-J7Hg4BwAJ
>
> On further debugging, we found that from 2.2 onwards, Spark's Hive UDAF
> support uses partial aggregation and no longer provides the functionality
> that supported complete mode aggregation (see
> https://issues.apache.org/jira/browse/SPARK-19060 and
> https://issues.apache.org/jira/browse/SPARK-18186). As a result, the merge
> method is called where the update method was expected
> (https://github.com/DataSketches/sketches-hive/blob/master/src/main/java/com/yahoo/sketches/hive/hll/SketchEvaluator.java#L56),
> which throws the exception described in the forum thread above.
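
To illustrate the difference between the two modes, here is a rough toy model in Java. It is only a sketch under stated assumptions: ToyEvaluator, UpdateOnlySum, runComplete and runPartialThenFinal are hypothetical names invented for this example, and they do not reproduce Hive's GenericUDAFEvaluator API, Spark's planner, or the actual sketches-hive code. The point is only that an evaluator which works when fed raw rows in a single stage can fail once the plan inserts a merge step.

```java
import java.util.Arrays;
import java.util.List;

public class AggregationModes {

    /** Hypothetical stand-in for a UDAF evaluator; NOT Hive's real API. */
    interface ToyEvaluator {
        long[] newBuffer();
        void iterate(long[] buffer, int rawValue);  // "update": consumes one raw input row
        long terminatePartial(long[] buffer);       // emits intermediate state
        void merge(long[] buffer, long partial);    // consumes intermediate state
        long terminate(long[] buffer);              // emits the final result
    }

    /** A sum that is only prepared for raw rows; merge is assumed unsupported. */
    static class UpdateOnlySum implements ToyEvaluator {
        public long[] newBuffer() { return new long[] {0L}; }
        public void iterate(long[] buffer, int rawValue) { buffer[0] += rawValue; }
        public long terminatePartial(long[] buffer) { return buffer[0]; }
        public void merge(long[] buffer, long partial) {
            throw new UnsupportedOperationException("merge was not expected in this mode");
        }
        public long terminate(long[] buffer) { return buffer[0]; }
    }

    /** Complete mode: a single stage, raw rows in, final result out. merge is never called. */
    static long runComplete(ToyEvaluator eval, List<Integer> rows) {
        long[] buffer = eval.newBuffer();
        for (int row : rows) {
            eval.iterate(buffer, row);
        }
        return eval.terminate(buffer);
    }

    /** Partial aggregation: each partition is reduced first, then the partials are merged. */
    static long runPartialThenFinal(ToyEvaluator eval, List<List<Integer>> partitions) {
        long[] finalBuffer = eval.newBuffer();
        for (List<Integer> partition : partitions) {
            long[] partialBuffer = eval.newBuffer();
            for (int row : partition) {
                eval.iterate(partialBuffer, row);
            }
            eval.merge(finalBuffer, eval.terminatePartial(partialBuffer));
        }
        return eval.terminate(finalBuffer);
    }

    public static void main(String[] args) {
        ToyEvaluator eval = new UpdateOnlySum();
        System.out.println("complete mode: " + runComplete(eval, Arrays.asList(1, 2, 3)));
        try {
            runPartialThenFinal(eval, Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3)));
        } catch (UnsupportedOperationException e) {
            System.out.println("partial plan failed: " + e.getMessage());
        }
    }
}
```

In this toy model the same evaluator returns a result under runComplete but throws under runPartialThenFinal, which mirrors the shape of the failure described above without claiming to match its exact cause.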
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)