To address one specific question:

> Docs says it usues sun.misc.unsafe to convert physical rdd structure into
byte array at some point for optimized GC and memory. My question is why is
it only applicable to SQL/Dataframe and not RDD? RDD has types too!

A principal difference between RDDs and DataFrames/Datasets is that the
latter have a schema associated to them. This means that they support only
certain types (primitives, case classes and more) and that they are
uniform, whereas RDDs can contain any serializable object and must not
necessarily be uniform. These properties make it possible to generate very
efficient serialization and other optimizations that cannot be achieved
with plain RDDs.

Reply via email to