erikerlandson commented on issue #25024: [SPARK-27296][SQL] User Defined 
Aggregators that do not ser/de on each input row
URL: https://github.com/apache/spark/pull/25024#issuecomment-509416668
 
 
   To elaborate on the 'raw object reference' above: what I specifically tried 
was using a DataType like `ObjectType(classOf[TDigest])` in the mutable 
aggregation buffer schema.
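
   For concreteness, an attempt of that kind might look roughly like the 
sketch below. It is an illustration only, not the code from this PR: `Sketch` 
is a hypothetical stand-in for `org.isarnproject.sketches.TDigest`, 
`SketchUDAF` is a made-up name, and the `UserDefinedAggregateFunction` API is 
assumed to be the one in use. The code compiles against the public API, but 
per this comment it is not expected to run successfully.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

// Hypothetical stand-in for a sketch type such as TDigest.
final class Sketch extends Serializable {
  private var sum = 0.0
  private var count = 0L
  def add(x: Double): Unit = { sum += x; count += 1 }
  def merge(that: Sketch): Unit = { sum += that.sum; count += that.count }
  def mean: Double = if (count == 0L) Double.NaN else sum / count
}

// Hypothetical UDAF that tries to keep a raw object reference in the buffer.
class SketchUDAF extends UserDefinedAggregateFunction {
  def inputSchema: StructType = StructType(StructField("x", DoubleType) :: Nil)

  // The buffer column is declared as an ObjectType rather than a Catalyst-native type.
  def bufferSchema: StructType =
    StructType(StructField("sketch", ObjectType(classOf[Sketch])) :: Nil)

  def dataType: DataType = DoubleType
  def deterministic: Boolean = true

  def initialize(buffer: MutableAggregationBuffer): Unit =
    buffer(0) = new Sketch

  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    if (!input.isNullAt(0)) buffer.getAs[Sketch](0).add(input.getDouble(0))

  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit =
    buffer1.getAs[Sketch](0).merge(buffer2.getAs[Sketch](0))

  def evaluate(buffer: Row): Any = buffer.getAs[Sketch](0).mean
}
```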
   
   That immediately fails here:
   
https://github.com/apache/spark/blob/3139d642fac0e6ae6b9edd1b4c2912c3a69f71e5/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/RowEncoder.scala#L215
   
   For fun I tried defaulting that case to "identity" for `ObjectType`. It gets 
farther, but then fails deep in code generation:
   ```
   ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 37, Column 24: No applicable constructor/method found for actual parameters "int, org.isarnproject.sketches.TDigest"
   ```
   
   So that is one flavor of Catalyst's problem with handling anything outside 
its defined universe of data types.
   

