rednaxelafx opened a new pull request #29602:
URL: https://github.com/apache/spark/pull/29602


   ### What changes were proposed in this pull request?
   
   Use `CodeGenerator.typeName()` instead of `Class.getCanonicalName()` in 
`CodegenContext.addReferenceObj()` for getting the runtime class name for an 
object.
   
   ### Why are the changes needed?
   
   https://github.com/apache/spark/pull/29439 fixed a bug in 
`CodegenContext.addReferenceObj()` for `Array[Byte]` (i.e. Spark SQL's 
`BinaryType`) objects, but unfortunately it introduced a regression for some 
nested Scala types.
   
   For example, for `implicitly[Ordering[UTF8String]]`, after that PR 
`CodegenContext.addReferenceObj()` would return `((null) references[0] /* ... 
*/)`. The actual type for `implicitly[Ordering[UTF8String]]` is 
`scala.math.LowPriorityOrderingImplicits$$anon$3` in Scala 2.12.10, and 
`Class.getCanonicalName()` returns `null` for that class.
   
   On the other hand, `Class.getName()` is safe to use for all non-array types, 
and Janino will happily accept the type name returned from `Class.getName()` 
for nested types. `CodeGenerator.typeName()` happens to do the right thing by 
correctly handling arrays and otherwise use `Class.getName()`. So it's a better 
alternative than `Class.getCanonicalName()`.
   
   Side note: rule of thumb for using Java reflection in Spark: it may be 
tempting to use `Class.getCanonicalName()`, but for functions that may need to 
handle Scala types, please avoid it due to potential issues with nested Scala 
types.
   Instead, use `Class.getName()` or utility functions in 
`org.apache.spark.util.Utils` (e.g. `Utils.getSimpleName()` or 
`Utils.getFormattedClassName()` etc).
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Added new unit test case for the regression case in `CodeGenerationSuite`.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to