Fangshi Li created SPARK-24216:
----------------------------------
Summary: Spark TypedAggregateExpression uses getSimpleName that is
not safe in scala
Key: SPARK-24216
URL: https://issues.apache.org/jira/browse/SPARK-24216
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.3.0, 2.3.1
Reporter: Fangshi Li

When we create an aggregator object within a function in Scala and pass the
aggregator to Spark Dataset's aggregation method, Spark initializes
TypedAggregateExpression with the name field set to
aggregator.getClass.getSimpleName. However, getSimpleName is not safe in a Scala
environment. For example, if the aggregator class's fully qualified name is
"com.linkedin.spark.common.lib.SparkUtils$keyAgg$2$", getSimpleName throws
java.lang.InternalError with the message "Malformed class name". This has been
reported in scalatest (https://github.com/scalatest/scalatest/pull/1044) and in
the upstream Scala JIRA (https://issues.scala-lang.org/browse/SI-8110).
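
A minimal sketch of the situation described above, without a Spark dependency. The names SimpleNameRepro, localObjectClass, describeSimpleName and the local object keyAgg are hypothetical stand-ins for an aggregator defined inside a function. Whether getSimpleName actually throws depends on the JDK and Scala versions in use: the failure is expected on JDK 8, while newer JDKs may return a name instead, so no output is shown.

```scala
object SimpleNameRepro {
  // Hypothetical stand-in for an aggregator object declared inside a function.
  // The compiler gives such a local object a name containing '$' separators.
  def localObjectClass(): Class[_] = {
    object keyAgg
    keyAgg.getClass
  }

  // Returns the simple name, or a description of the failure.
  // InternalError is a VirtualMachineError, so scala.util.Try would not
  // catch it; an explicit try/catch is used instead.
  def describeSimpleName(cls: Class[_]): String =
    try cls.getSimpleName
    catch {
      case e: InternalError => s"InternalError: ${e.getMessage}"
    }

  def main(args: Array[String]): Unit = {
    val cls = localObjectClass()
    println(s"getName:       ${cls.getName}")
    println(s"getSimpleName: ${describeSimpleName(cls)}")
  }
}
```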

To fix this issue, we follow the solution in
https://github.com/scalatest/scalatest/pull/1044 and add a safer version of
getSimpleName as a util method; TypedAggregateExpression will invoke this
util method rather than getClass.getSimpleName.
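
The idea of a safer getSimpleName can be sketched as follows. This is an illustration under stated assumptions, not the code from the scalatest pull request or from the eventual Spark patch: the object name SafeSimpleName, the helpers stripPackage, stripDollars and fromName, and the fallback rule (keep the last '$'-separated segment that is neither empty nor purely numeric) are all hypothetical choices.

```scala
object SafeSimpleName {
  // Hypothetical helper: drop the package prefix, e.g. "a.b.C$D$" becomes "C$D$".
  private def stripPackage(fullName: String): String =
    fullName.substring(fullName.lastIndexOf('.') + 1)

  // Hypothetical helper: split on '$' and keep the last segment that is
  // neither empty nor made up only of digits (compiler-generated counters).
  private def stripDollars(name: String): String = {
    val segments = name.split('$').filter(s => s.nonEmpty && !s.forall(_.isDigit))
    segments.lastOption.getOrElse(name)
  }

  // Fallback that works on the class name string alone.
  def fromName(fullName: String): String =
    stripDollars(stripPackage(fullName))

  // Try the JDK method first; fall back only when it reports a malformed name.
  def getSimpleName(cls: Class[_]): String =
    try cls.getSimpleName
    catch {
      case _: InternalError => fromName(cls.getName)
    }
}
```

Under these assumed rules, SafeSimpleName.fromName("com.linkedin.spark.common.lib.SparkUtils$keyAgg$2$") yields "keyAgg", and ordinary classes such as classOf[String] still go through the standard getSimpleName path.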
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)