Re: UDAFs have an inefficiency problem

Erik Erlandson Fri, 05 Jul 2019 17:23:02 -0700

I submitted a PR for this:
https://github.com/apache/spark/pull/25024


On Wed, Mar 27, 2019 at 4:19 PM Erik Erlandson <eerla...@redhat.com> wrote:

> I describe some of the details here:
> https://issues.apache.org/jira/browse/SPARK-27296
>
> The short version of the story is that aggregating data structures (UDTs)
> used by UDAFs are serialized to a Row object, and de-serialized, for every
> row in a data frame.
> Cheers,
> Erik
>
>

Re: UDAFs have an inefficiency problem

Reply via email to