thanks for the feedback. I would remove it from all operations (including compression), Spark block aggregation, and codegen templates, but keep the function objects around. If you want to handle the compression framework yourself, we only need to synchronize on a few things that are shared (e.g., opcodes and internal APIs).

Let's keep this thread open for a week, though, to hear and discuss potential concerns.

Regards,
Matthias

On 1/17/2021 10:17 PM, Baunsgaard, Sebastian wrote:
Hi Matthias,


I agree to remove it, especially because of the double-size allocation of
outputs.

From the point of view of the compression framework, it is also making
things more complicated than they need to be.

While removing it, would you also remove it from the compression part of
the system?


Regards,

Sebastian

________________________________
From: Matthias Boehm <[email protected]>
Sent: Sunday, January 17, 2021 10:04:14 PM
To: [email protected]
Subject: Kahan Addition

Hi all,

experiments with lower-precision levels and hardware with higher memory
bandwidth have shown that our use of Kahan addition becomes increasingly
costly. In the past, the overhead of computing the corrections was hidden
by I/O- and memory-bound operations like unary aggregates (e.g., rowSums,
colSums, sum), but this is changing.
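
For readers unfamiliar with the technique: compensated (Kahan) summation carries a separate correction term alongside each running sum, which is where both the extra per-element arithmetic and the doubled intermediate state come from. A minimal sketch for illustration only (names are mine, not the system's actual function objects):

```java
/** Illustrative sketch of Kahan (compensated) summation. */
public class KahanSketch {
    public static double kahanSum(double[] values) {
        double sum = 0.0;
        double corr = 0.0; // running compensation for lost low-order bits
        for (double v : values) {
            double y = v - corr;  // apply the pending correction
            double t = sum + y;   // high-order bits accumulate in t
            corr = (t - sum) - y; // recover low-order bits lost in t
            sum = t;              // note: two state variables per sum
        }
        return sum;
    }
}
```

The two state variables (`sum` and `corr`) are exactly what forces the double-size intermediates in partial aggregation, and the three extra floating-point operations per element are the cost that is no longer hidden once the operation stops being memory-bound.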

So, for reasons of performance and code complexity, I intend to entirely
remove Kahan addition from our system (in unary aggregates and partial
aggregation). If there are objections or concerns, let's discuss them.

Regards,
Matthias
