GitHub user fangshil opened a pull request:

    https://github.com/apache/spark/pull/21276

    [SPARK-24216][SQL] Spark TypedAggregateExpression uses getSimpleName that 
is not safe in scala

    ## What changes were proposed in this pull request?
    
    When we create an aggregator object within a function in Scala and pass it 
to a Spark Dataset aggregation method, Spark initializes 
TypedAggregateExpression with its name field set to 
aggregator.getClass.getSimpleName. However, getSimpleName is not safe in a 
Scala environment. For example, if the aggregator class's fully qualified name 
is "com.linkedin.spark.common.lib.SparkUtils$keyAgg$2$", getSimpleName throws 
an exception with the message "Malformed class name". This has been reported 
in scalatest (https://github.com/scalatest/scalatest/pull/1044) and in the 
upstream Scala JIRA (https://issues.scala-lang.org/browse/SI-8110).
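
    As an illustration only, the sketch below reproduces the naming pattern 
without any Spark dependency. The names LocalObjectNameDemo, KeyAggLike and 
keyAgg are hypothetical stand-ins, not code from this patch. Whether 
getSimpleName actually throws depends on the JDK in use (the error was 
commonly seen on Java 8), so the sketch catches the error instead of assuming 
it.

```scala
object LocalObjectNameDemo {
  // Hypothetical stand-in for an aggregator type; not part of Spark.
  trait KeyAggLike { def zero: Long }

  // Declares an object inside a method and returns its runtime class.
  // scalac gives such classes synthetic names containing '$'.
  def localObjectClass(): Class[_] = {
    object keyAgg extends KeyAggLike { val zero = 0L }
    keyAgg.getClass
  }

  def main(args: Array[String]): Unit = {
    val cls = localObjectClass()
    println(s"binary name: ${cls.getName}")
    // InternalError is a fatal error, so scala.util.Try would not catch it;
    // an explicit try/catch is needed here.
    try println(s"simple name: ${cls.getSimpleName}")
    catch {
      case e: InternalError => println(s"getSimpleName failed: ${e.getMessage}")
    }
  }
}
```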
    
    To fix this issue, we follow the solution in 
https://github.com/scalatest/scalatest/pull/1044 and add a safer version of 
getSimpleName as a utility method. TypedAggregateExpression now invokes this 
utility method instead of getClass.getSimpleName.
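
    A minimal sketch of what such a utility could look like follows. It is 
not the code in this patch: SafeNames, safeSimpleName and stripToSimpleName 
are hypothetical names, and the fallback rule (drop the package, then take the 
last '$'-separated segment that is neither empty nor purely numeric) is an 
assumption chosen for illustration.

```scala
object SafeNames {
  // Hypothetical fallback: derive a readable name from a JVM binary name.
  def stripToSimpleName(fullName: String): String = {
    // Drop the package prefix, if any.
    val afterPackage = fullName.substring(fullName.lastIndexOf('.') + 1)
    // Drop empty segments and compiler-generated numeric segments.
    val parts = afterPackage.split('$').filter(p => p.nonEmpty && !p.forall(_.isDigit))
    if (parts.isEmpty) afterPackage else parts.last
  }

  // Prefer the JDK's answer; fall back only when it reports a malformed name.
  def safeSimpleName(cls: Class[_]): String =
    try cls.getSimpleName
    catch {
      case _: InternalError => stripToSimpleName(cls.getName)
    }
}
```

    Under these assumptions, the class name quoted in the description would 
map to "keyAgg", while ordinary classes keep the name the JDK already 
returns.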
    
    ## How was this patch tested?
    Added a unit test.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fangshil/spark SPARK-24216

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21276.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21276
    
----
commit a43493b6299d4b6962f2b78c6956b29df51a78c9
Author: Fangshi Li <fli@...>
Date:   2018-04-27T19:54:49Z

    SPARK-24216: Spark TypedAggregateExpression uses getSimpleName that is not 
safe in scala

----

