Hi Michael, Is there a section in the spark documentation demonstrate how to serialize arbitrary objects in Dataframe? The last time I did was using some User Defined Type (copy from VectorUDT).
Best Regards, Jerry On Tue, Feb 2, 2016 at 8:46 PM, Michael Armbrust <mich...@databricks.com> wrote: > A principal difference between RDDs and DataFrames/Datasets is that the >> latter have a schema associated to them. This means that they support only >> certain types (primitives, case classes and more) and that they are >> uniform, whereas RDDs can contain any serializable object and must not >> necessarily be uniform. These properties make it possible to generate very >> efficient serialization and other optimizations that cannot be achieved >> with plain RDDs. >> > > You can use Encoder.kryo() as well to serialize arbitrary objects, just > like with RDDs. >