Hi all,

There's been longstanding demand for statically typed Datasets of Avro.
Functionality from the now-deprecated Databricks Spark-Avro project was
folded into Spark, but it still only provides DataFrames over Avro data. As
discussed in the PR below, there are still drawbacks to not having fully
statically typed Datasets of Avro.

There's an open PR adding a first-class Encoder for statically typed
Datasets of Avro:

https://github.com/apache/spark/pull/22878 :
https://issues.apache.org/jira/browse/SPARK-25789 (originally in
Databricks/spark-avro, https://github.com/databricks/spark-avro/pull/217 :
https://github.com/databricks/spark-avro/issues/169)

We've tested the content of this PR extensively against complex, deeply
nested Avro structures. It seems ready for a final review and nearly ready
to merge.

Alek Eskilson
github : bdrillard
