Unique Kylin coprocessor per table causes excessive disk usage

Árki Gábor Fri, 16 Jul 2021 04:04:08 -0700

Dear All,

We have been seeing a problem on our long-running Kylin clusters (Kylin
3.1.0 and HBase 1.4.10 / EMR 5.28).
As our data grows, the HBase region servers run out of disk space.
This seems to happen because Kylin configures a uniquely named
coprocessor jar for each table. The region servers download these jars
to a tmp folder and, probably because every table's jar has a unique
name, the same coprocessor ends up stored once for every table created
by Kylin. Someone else has already reported this as KYLIN-5022
<https://issues.apache.org/jira/browse/KYLIN-5022>, and I have added
some findings there as well.
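
For anyone who wants to check whether their region servers are affected,
here is a minimal sketch (not taken from KYLIN-5022) that groups the jar
files under a directory by content hash and estimates how much space the
redundant copies take. The function names are made up for illustration,
and the directory to scan is an assumption: point it at wherever your
region server keeps its downloaded coprocessor jars.

```python
import hashlib
import os
from collections import defaultdict


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_jars(root):
    """Group *.jar files under root by content hash.

    Only groups with more than one file are returned, i.e. jars that
    have different names but identical bytes.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".jar"):
                path = os.path.join(dirpath, name)
                groups[sha256_of(path)].append(path)
    return {h: sorted(paths) for h, paths in groups.items() if len(paths) > 1}


def wasted_bytes(duplicates):
    """Bytes used by redundant copies (all but one file per group)."""
    return sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in duplicates.values()
    )


# Example (the path is a placeholder, not a known HBase default):
# dups = find_duplicate_jars("/path/to/regionserver/tmp")
# print(len(dups), "duplicate groups,", wasted_bytes(dups), "bytes wasted")
```

If this reports one large group with roughly as many files as there are
Kylin tables, that would be consistent with the behavior described above.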


For now, the only workaround I have found is to increase the disk size of
our clusters, but that is not a good solution in terms of scaling or cost.
Is there a way to reconfigure this behavior? Is it even intentional to use
a unique jar name for every table?

Regards,
Gabor
