thanks for the feedback. I would remove it from all operations (incl.
compression), Spark block aggregation, and codegen templates, but keep
the function objects around. If you want to handle the compression
framework yourself, we only need to synchronize on a few things that
are shared (e.g., opcodes and internal APIs).
Let's keep this thread open for a week, though, to hear and discuss
potential concerns.
Regards,
Matthias
On 1/17/2021 10:17 PM, Baunsgaard, Sebastian wrote:
Hi Matthias,
I agree to remove it, especially because of the double-size allocation
of outputs.
From the point of view of the compression framework, it also makes
things more complicated than they need to be.
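To make the allocation point concrete, here is a minimal hypothetical
sketch (kahanRowSums and the names around it are illustrative, not our
actual compression or aggregate code): every output cell has to carry a
(sum, correction) pair, so a rowSums over n rows allocates 2*n doubles
instead of n.

    public class KahanAlloc {
      // Hypothetical kahanRowSums: each output cell carries both a sum
      // and a correction, doubling the output allocation.
      static double[][] kahanRowSums(double[][] m) {
        int n = m.length;
        double[] sum  = new double[n]; // aggregate values
        double[] corr = new double[n]; // the second, extra allocation
        for (int i = 0; i < n; i++)
          for (double v : m[i]) {
            double y = v - corr[i];     // apply correction so far
            double t = sum[i] + y;      // low-order bits may be lost
            corr[i] = (t - sum[i]) - y; // recover the lost bits
            sum[i]  = t;
          }
        return new double[][] {sum, corr}; // both needed to merge partials
      }

      public static void main(String[] args) {
        double[][] m = {{0.1, 0.2}, {0.3, 0.4}};
        double[][] out = kahanRowSums(m); // out[0]=sums, out[1]=corrections
        System.out.println(java.util.Arrays.toString(out[0]));
      }
    }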
While removing it, would you also remove it from the compression part
of the system?
Regards
Sebastian
________________________________
From: Matthias Boehm <[email protected]>
Sent: Sunday, January 17, 2021 10:04:14 PM
To: [email protected]
Subject: Kahan Addition
Hi all,
experiments with lower precision levels and hardware with higher memory
bandwidth have shown that our use of Kahan addition is becoming
increasingly costly. In the past, the overhead of computing the
corrections was hidden by I/O- and memory-bound operations like unary
aggregates (e.g., rowSums, colSums, sum), but this is changing.
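For concreteness, here is a minimal, self-contained sketch of the two
variants (plain helper names, not our internal function objects); the
correction arithmetic in kahanSum is exactly the per-element overhead
in question:

    public class KahanSketch {
      // Plain summation: one add per element.
      static double plainSum(double[] a) {
        double sum = 0;
        for (double v : a)
          sum += v;
        return sum;
      }

      // Kahan summation: a running correction term recovers the
      // low-order bits lost in each add (4 flops per element vs. 1).
      static double kahanSum(double[] a) {
        double sum = 0, corr = 0;
        for (double v : a) {
          double y = v - corr;    // apply correction from the previous add
          double t = sum + y;     // low-order bits of y can be lost here
          corr = (t - sum) - y;   // recover the lost low-order bits
          sum = t;
        }
        return sum;
      }

      public static void main(String[] args) {
        double[] a = new double[1_000_000];
        java.util.Arrays.fill(a, 0.1);
        System.out.println(plainSum(a) + " vs " + kahanSum(a));
      }
    }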
So, for reasons of performance and code complexity, I intend to remove
Kahan addition entirely from our system (in unary aggregates and partial
aggregation). If there are objections or concerns, let's discuss them.
Regards,
Matthias