[SQL][CodeGen] Is there a way to set break point and debug the generated code?

2017-01-10 Thread dragonly
I have recently been hacking on Spark SQL, trying to add some new UDTs and functions, as well as some new Expression classes. I ran into a problem with the return type of the nullSafeEval method. In one of the new Expression classes, I want to return an array of my UDT, and my code is like `return

Re: What is mainly different from a UDT and a spark internal type that ExpressionEncoder recognized?

2016-12-27 Thread dragonly
Thanks for your reply! Here's my *understanding*: basic types that ScalaReflection understands are encoded into the Tungsten binary format, while UDTs are encoded into GenericInternalRow, which stores the JVM objects in an Array[Any] under the hood, and thus lose the memory footprint efficiency and

What is mainly different from a UDT and a spark internal type that ExpressionEncoder recognized?

2016-12-27 Thread dragonly
I have recently been reading the source code of the Spark SQL project, and found some interesting Databricks blog posts about the Tungsten project. I've roughly read through the encoder and unsafe representation parts of the Tungsten project (haven't read the algorithm part, such as the cache-friendly hashmap