I have recently been hacking on SparkSQL, trying to add some new UDTs and
functions, as well as some new Expression classes. I ran into a problem with
the return type of the nullSafeEval method. In one of the new Expression
classes, I want to return an array of my UDT, and my code is like `return
Thanks for your reply!
Here's my *understanding*:
Basic types that ScalaReflection understands are encoded into the Tungsten
binary format, while UDTs are encoded into GenericInternalRow, which stores
the JVM objects in an Array[Any] under the hood, and thus lose the memory
footprint efficiency and
I've recently been reading the source code of the SparkSQL project, and found
some interesting Databricks blog posts about the Tungsten project. I've roughly
read through the encoder and unsafe representation parts of the Tungsten
project (haven't read the algorithm parts such as the cache-friendly hashmap